Isolater - Feed

Ax Mustafa Arslan 2/17/2026

Aeon: High-Performance Neuro-Symbolic Memory Management for Long-Horizon LLM Agents

Aeon: neuro-symbolic memory management system for long-horizon LLM agents addressing context window and attention cost limitations.

Ax Yongxin Deng, Zhen Fang, Sharon Li, Ling Chen 2/17/2026

Beyond In-Domain Detection: SpikeScore for Cross-Domain Hallucination Detection

SpikeScore: hallucination detection method for LLMs with improved cross-domain generalization.

Ax Hao Shen, Hang Yang, Zhouhong Gu 2/17/2026

ScholarGym: Benchmarking Large Language Model Capabilities in the Information-Gathering Stage of Deep Research

ScholarGym: benchmark for evaluating LLM capabilities in the information-gathering stage of deep research systems.

Ax Shuo Liu, Tianle Chen, Ryan Amiri, Christopher Amato 2/17/2026

Learning Decentralized LLM Collaboration with Multi-Agent Actor Critic

Learning decentralized LLM collaboration using multi-agent reinforcement learning without centralized execution protocols.

Ax Hyejun Jeong, Amir Houmansadr, Shlomo Zilberstein, Eugene Bagdasarian 2/17/2026

Persuasion Propagation in LLM Agents

Study on persuasion propagation: how belief-level intervention affects downstream behavior in LLM agents executing long-horizon tasks.

Ax Salaheddin Alzu'bi, Baran Nama, Arda Kaz, Anushri Eswaran, Weiyuan Chen, Sarvesh Khetan, Rishab Bala, Tu Vu, Sewoong Oh 2/17/2026

ROMA: Recursive Open Meta-Agent Framework for Long-Horizon Multi-Agent Systems

ROMA: recursive framework for long-horizon multi-agent tasks using task decomposition and structured aggregation to handle context limits and execution complexity.

Ax Shifat E. Arman, Syed Nazmus Sakib, Tapodhir Karmakar Taton, Nafiul Haque, Shahrear Bin Amin 2/17/2026

PATHWAYS: Evaluating Investigation and Context Discovery in AI Web Agents

PATHWAYS: benchmark of 250 web agent tasks evaluating ability to discover and use hidden contextual information across closed/open models.

Ax Zhangquan Chen, Jiale Tao, Ruihuang Li, Yihao Hu, Ruitao Chen, Zhantao Yang, Xinlei Yu, Haodong Jing, Manyuan Zhang, Shuai Shao, Biao Wang, Qinglin Lu, Ruqi Huang 2/17/2026

OmniVideo-R1: Reinforcing Audio-visual Reasoning with Query Intention and Modality Attention

OmniVideo-R1: framework for improving audio-visual reasoning in multimodal models using query intention and modality attention.

Ax Alisia Lupidi, Bhavul Gauri, Thomas Simon Foster, Bassel Al Omari, Despoina Magka, Alberto Pepe, Alexis Audran-Reiss, Muna Aghamelu, Nicolas Baldwin, Lucia Cipolina-Kun, Jean-Christophe Gagnon-Audet, Chee Hau Leow, Sandra Lefdal, Hossam Mossalam, Abhinav Moudgil, Saba Nazir, Emanuel Tewolde, Isabel Urrego, Jordi Armengol Estape, Amar Budhiraja, Gaurav Chaurasia, Abhishek Charnalia, Derek Dunfield, Karen Hambardzumyan, Daniel Izcovich, Martin Josifoski, Ishita Mediratta, Kelvin Niu, Parth Pathak, Michael Shvartsman, Edan Toledo, Anton Protopopov, Roberta Raileanu, Alexander Miller, Tatiana Shavrina, Jakob Foerster, Yoram Bachrach 2/17/2026

AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents

AIRS-Bench: benchmark of 20 ML research tasks for evaluating AI agent capabilities across language modeling, mathematics, bioinformatics, and time series forecasting.

Ax Xin Wang, Hualin Zhou, Sheng Guang Wang, Ting Dang, Yu Zhang, Hong Jia, Tao Gu 2/17/2026

LQA: A Lightweight Quantized-Adaptive Framework for Vision-Language Models on the Edge

LQA framework for deploying vision-language models on edge devices using quantization and gradient-free test-time adaptation.

Ax Igor Santos-Grueiro 2/17/2026

When Evaluation Becomes a Side Channel: Regime Leakage and Structural Mitigations for Alignment Assessment

Analyzes regime leakage in AI safety evaluation, where situationally aware agents exploit differences between evaluation and deployment.

Ax John Muchovej, Amanda Royka, Shane Lee, Julian Jara-Ettinger 2/17/2026

GPT-4o Lacks Core Features of Theory of Mind

Tests whether GPT-4o possesses Theory of Mind via causal model evaluation, finding it lacks core ToM representations.

Ax Khiem H. Le, Hieu H. Pham, Thao BT. Nguyen, Tu A. Nguyen, Tien N. Thanh, Cuong D. Do 2/17/2026

LightX3ECG: A Lightweight and eXplainable Deep Learning System for 3-lead Electrocardiogram Classification

Lightweight explainable deep learning system for ECG classification with three leads for cardiovascular disease detection.

Ax Haoran Li, XiaoLu Li, Yihang Lin, Yanbin Hao, Haiyong Xie, Pengyuan Zhou, Yong Liao 2/17/2026

TKN: Transformer-based Keypoint Prediction Network For Real-time Video Prediction

TKN: real-time video prediction using transformer-based keypoint detection with reduced computation and memory.

Ax Vincent Liu, Prabhat Nagarajan, Andrew Patterson, Martha White 2/17/2026

When is Offline Policy Selection Sample Efficient for Reinforcement Learning?

Analyzes fundamental limits of offline policy selection in reinforcement learning from a sample-efficiency perspective.

Ax Yong Liu, Zirui Zhu, Chaoyu Gong, Minhao Cheng, Cho-Jui Hsieh, Yang You 2/17/2026

Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning

Sparse MeZO improves memory-efficient zeroth-order LLM fine-tuning by using sparse parameter updates during training.

Ax Sunny Sanyal, Ravid Shwartz-Ziv, Alexandros G. Dimakis, Sujay Sanghavi 2/17/2026

When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models

Identifies attention collapse in the deeper layers of LLMs and introduces the Inheritune method to create smaller, more efficient models.

Ax Akira Kitaoka 2/17/2026

Exact Solution to Data-Driven Inverse Optimization of MILPs in Finite Time via Gradient-Based Methods

Proposes gradient-based methods for data-driven inverse optimization of mixed integer linear programs.

Ax Luca Castri, Gloria Beraldo, Sariah Mghames, Marc Hanheide, Nicola Bellotto 2/17/2026

Experimental Evaluation of ROS-Causal in Real-World Human-Robot Spatial Interaction Scenarios

Evaluates ROS-Causal, a causal inference implementation for human-robot spatial interaction in real-world scenarios.

Ax Shenghui Li, Fanghua Ye, Meng Fang, Jiaxu Zhao, Yun-Hin Chan, Edith C. H. Ngai, Thiemo Voigt 2/17/2026

Synergizing Foundation Models and Federated Learning: A Survey

Survey of synergy between Foundation Models and Federated Learning, covering FMs adapted for distributed learning scenarios.

Ax Shengyuan Ye, Bei Ouyang, Tianyi Qian, Liekang Zeng, Jingyi Li, Jiangsu Du, Xiaowen Chu, Guoliang Xing, Xu Chen 2/17/2026

Resource-Efficient Personal Large Language Models Fine-Tuning with Collaborative Edge Computing

Framework for resource-efficient edge-based fine-tuning of personal LLMs using collaborative edge computing while preserving privacy.

Ax Zhaomin Wu, Jizhou Guo, Junyi Hou, Bingsheng He, Lixin Fan, Qiang Yang 2/17/2026

Model-based Large Language Model Customization as Service

Proposes differentially private customization service for LLMs that enables domain-specific fine-tuning without uploading user data.

Ax Tianyu Chen, Shuai Lu, Shan Lu, Yeyun Gong, Chenyuan Yang, Xuheng Li, Md Rakib Hossain Misu, Hao Yu, Nan Duan, Peng Cheng, Fan Yang, Shuvendu K Lahiri, Tao Xie, Lidong Zhou 2/17/2026

Automated Proof Generation for Rust Code via Self-Evolution

SAFE framework automates formal proof generation for Rust code using LLMs via self-evolution to overcome proof data scarcity.

Ax Gene Yu, Ce Guo, Wayne Luk 2/17/2026

VCDF: A Validated Consensus-Driven Framework for Time Series Causal Discovery

Introduces VCDF, a method-agnostic framework for time series causal discovery that improves robustness across temporal subsets.

Ax Kaizhao Liang, Lizhang Chen, Bo Liu, Qiang Liu 2/17/2026

Cautious Optimizers: Improving Training with One Line of Code

Proposes a one-line PyTorch modification to momentum-based optimizers, yielding cautious optimizers (C-AdamW, C-Lion) for improved transformer pretraining.

Ax Muhammad Fetrat Qharabagh, Mohammadreza Ghofrani, Kimon Fountoulakis 2/17/2026

LVLM-COUNT: Enhancing the Counting Ability of Large Vision-Language Models

Evaluates and improves counting abilities of large vision-language models across multiple visual datasets and benchmarks.

Ax Nianze Tao, Minori Abe 2/17/2026

Bayesian Flow Is All You Need to Sample Out-of-Distribution Chemical Spaces

Introduces ChemBFN, a model using Bayesian flow networks to generate novel molecules outside the training distribution for drug design.

Ax Jiacheng Cui, Zhaoyi Li, Xiaochen Ma, Xinyue Bi, Yaxin Luo, Zhiqiang Shen 2/17/2026

Dataset Distillation via Committee Voting

Proposes CV-DD, a committee voting approach for dataset distillation to create compact representative datasets for efficient model training.

Ax Federico Errica, Henrik Christiansen, Viktor Zaverkin, Mathias Niepert, Francesco Alesiani 2/17/2026

Adaptive Width Neural Networks

Technique for learning neural network layer width during training without manual hyperparameter tuning or architecture search.

Ax Xun Deng, Han Zhong, Rui Ai, Fuli Feng, Zheng Wang, Xiangnan He 2/17/2026

Less is More: Improving LLM Alignment via Preference Data Selection

Method improving Direct Preference Optimization through margin-maximization data selection to address parameter shrinkage from noisy annotations.

Ax Sho Nakatani (SecDevLab Inc.) 2/17/2026

RapidPen: Fully Automated IP-to-Shell Penetration Testing with LLM-based Agents

RapidPen: autonomous penetration testing framework using LLM agents to discover and exploit vulnerabilities from IP addresses.

Ax Thomas Hickling, Maxwell Hogan, Abdulla Tammam, Nabil Aouf 2/17/2026

Deep Reinforcement Learning based Autonomous Decision-Making for Cooperative UAVs: A Search and Rescue Real World Application

Deep reinforcement learning framework for autonomous multi-UAV coordination in GNSS-denied search and rescue.

Ax Sharan Mourya, Hannes Leipold, Bibhas Adhikari 2/17/2026

Contextual Quantum Neural Networks for Stock Price Prediction

Application of quantum machine learning to stock price prediction using contextual quantum neural networks.

Ax Yuqi Hu, Longguang Wang, Xian Liu, Ling-Hao Chen, Yuwei Guo, Yukai Shi, Ce Liu, Anyi Rao, Zeyu Wang, Hui Xiong 2/17/2026

Simulating the Real World: A Unified Survey of Multimodal Generative Models

Survey of multimodal generative models for capturing world dynamics across 2D, 3D, video, and 4D representations.

Ax Seongho Son, William Bankes, Sangwoong Yoon, Shyam Sundhar Ramesh, Xiaohang Tang, Ilija Bogunovic 2/17/2026

Robust Multi-Objective Controlled Decoding of Large Language Models

RMOD: inference-time algorithm aligning LLMs to multiple objectives via robust decoding using maximin game theory.

Ax Zongyue Qin, Shichang Zhang, Mingxuan Ju, Tong Zhao, Neil Shah, Yizhou Sun 2/17/2026

Heuristic Methods are Good Teachers to Distill MLPs for Graph Link Prediction

Distills MLPs for graph link prediction using heuristic methods as teachers.

Ax Prothit Sen, Sai Mihir Jakkaraju 2/17/2026

Modeling AI-Human Collaboration as a Multi-Agent Adaptation

Agent-based simulation formalizing AI-human collaboration by modeling distinct optimization and satisficing decision heuristics.

Ax Chihao Shen, Connor Dilgren, Purva Chiniya, Luke Griffith, Yu Ding, Yizheng Chen 2/17/2026

SecRepoBench: Benchmarking Code Agents for Secure Code Completion in Real-World Repositories

SecRepoBench: benchmark evaluating code agents and LLMs on secure code completion across real-world C/C++ repositories covering 15 CWEs.

Ax Abhijit Gupta 2/17/2026

Sparse Latent Factor Forecaster (SLFF) with Iterative Inference for Transparent Multi-Horizon Commodity Futures Prediction

Sparse Latent Factor Forecaster model for multi-horizon commodity futures prediction with iterative inference.

Ax Xianrui Zhong, Bowen Jin, Siru Ouyang, Yanzhen Shen, Qiao Jin, Yin Fang, Zhiyong Lu, Jiawei Han 2/17/2026