Isolater - Feed

Ax Minqi Jiang, Andrei Lupu, Yoram Bachrach 2/24/2026

Bootstrapping Task Spaces for Self-Improvement

Presents Exploratory Iteration (ExIt), RL methods enabling agents to self-improve through iterative refinement without fixed iteration limits.

Ax Wei Chen, Yuqian Wu, Yuanshao Zhu, Xixuan Hao, Shiyu Wang, Xiaofang Zhou, Yuxuan Liang 2/24/2026

Select, then Balance: Exploring Exogenous Variable Modeling of Spatio-Temporal Forecasting

Explores exogenous variable modeling in spatio-temporal forecasting systems to improve prediction accuracy.

Ax Geon Lee, Bhuvesh Kumar, Clark Mingxuan Ju, Tong Zhao, Kijung Shin, Neil Shah, Liam Collins 2/24/2026

Sequential Data Augmentation for Generative Recommendation

Data augmentation strategies for generative recommendation systems that improve generalization in sequential user behavior prediction.

Ax Niccolò Rocchi, Fabio Stella, Cassio de Campos 2/24/2026

Towards Privacy-Aware Bayesian Networks: A Credal Approach

Privacy-aware Bayesian network approach using credal sets for secure public release of probabilistic graphical models.

Ax Yiyuan Pan, Zhe Liu, Hesheng Wang 2/24/2026

Wonder Wins Ways: Curiosity-Driven Exploration through Multi-Agent Contextual Calibration

Multi-agent RL with curiosity-driven exploration using contextual calibration to distinguish novelty from environmental stochasticity.

Ax Yinuo Ren, Wenhao Gao, Lexing Ying, Grant M. Rotskoff, Jiequn Han 2/24/2026

DriftLite: Lightweight Drift Control for Inference-Time Scaling of Diffusion Models

DriftLite: Training-free particle-based approach for inference-time diffusion model adaptation to new distributions.

Ax Shirin Alanova, Kristina Kazistova, Ekaterina Galaeva, Alina Kostromina, Vladimir Smirnov, Redko Dmitry, Alexey Dontsov, Maxim Zhelnin, Evgeny Burnaev, Egor Shvetsov 2/24/2026

Lightweight error mitigation strategies for post-training N:M activation sparsity in LLMs

Error mitigation methods for post-training N:M activation sparsity in LLMs enabling dynamic input-adaptive compression.

Ax Xingjian Wu, Jianxin Jin, Wanghui Qiu, Peng Chen, Yang Shu, Bin Yang, Chenjuan Guo 2/24/2026

Aurora: Towards Universal Generative Multimodal Time Series Forecasting

Aurora: Multimodal foundation model for cross-domain time series forecasting integrating text and temporal data.

Ax Narada Maugin, Tristan Cazenave 2/24/2026

SpinGPT: A Large-Language-Model Approach to Playing Poker Correctly

SpinGPT applies an LLM approach to poker strategy, addressing the computational limits of counterfactual regret minimization (CFR) in multi-player game settings.

Ax Aman Gupta, Rafael Celente, Abhishek Shivanna, D. T. Braithwaite, Gregory Dexter, Shao Tang, Hiroto Udagawa, Daniel Silva, Rohan Ramanath, S. Sathiya Keerthi 2/24/2026

Effective Quantization of Muon Optimizer States

8-bit blockwise quantization of Muon optimizer states reducing memory overhead for large-scale LLM pretraining.

Ax Wei Wang, Dong-Dong Wu, Ming Li, Jingxiong Zhang, Gang Niu, Masashi Sugiyama 2/24/2026

Accessible, Realistic, and Fair Evaluation of Positive-Unlabeled Learning Algorithms

Framework for standardizing evaluation of positive-unlabeled learning algorithms under consistent experimental settings.

Ax Hao Chen, Tao Han, Jie Zhang, Song Guo, Lei Bai 2/24/2026

STCast: Adaptive Boundary Alignment for Global and Regional Weather Forecasting

Weather forecasting method using adaptive boundary alignment for regional and global predictions with spatial-temporal modeling.

Ax Jubayer Ibn Hamid, Ifdita Hasan Orney, Ellen Xu, Chelsea Finn, Dorsa Sadigh 2/24/2026

Polychromic Objectives for Reinforcement Learning

Polychromic objectives for RL fine-tuning preventing policy collapse and preserving diversity in pretrained model behaviors.

Ax Jaewoo Lee, Minsu Kim, Sanghyeok Choi, Inhyuck Song, Sujin Yun, Hyeongyu Kang, Woocheol Shin, Taeyoung Yun, Kiyoung Om, Jinkyoo Park 2/24/2026

Diffusion Alignment as Variational Expectation-Maximization

Diffusion Alignment as Variational EM framework addressing reward over-optimization and mode collapse in diffusion model alignment.

Ax Yuchen Cai, Ding Cao, Xin Xu, Zijun Yao, Yuqing Huang, Zhenyu Tan, Benyi Zhang, Guangzhong Sun, Guiquan Liu, Junfeng Fang 2/24/2026

On Predictability of Reinforcement Learning Dynamics for Large Language Models

Analysis of RL-induced parameter dynamics in LLMs revealing rank-1 dominance in reasoning improvements and predictability of training trajectories.

Ax Kwanhee Lee, Hyeondo Jang, Dongyeop Lee, Dan Alistarh, Namhoon Lee 2/24/2026

The Unseen Frontier: Pushing the Limits of LLM Sparsity with Surrogate-Free ADMM

Surrogate-free ADMM method for LLM pruning achieving >50% sparsity without accuracy degradation, breaking through conventional compression limits.

Ax Anirudh Subramanyam, Yuxin Chen, Robert L. Grossman 2/24/2026

Scaling Laws Revisited: Modeling the Role of Data Quality in Language Model Pretraining

Scaling law formalization incorporating data quality parameter for language model pretraining, extending traditional model/dataset size relationships.

Ax Xiangyu Shi, Marco Chiesa, Gerald Q. Maguire Jr., Dejan Kostic 2/24/2026

KVComm: Enabling Efficient LLM Communication through Selective KV Sharing

KVComm: Communication framework for multi-agent LLM systems using selective key-value sharing instead of natural language or hidden states.

Ax Nirjhar Das, Mohit Sharma, Praharsh Nanavati, Kirankumar Shiragur, Amit Deshpande 2/24/2026

Cost Efficient Fairness Audit Under Partial Feedback

Fairness auditing framework for classifiers with partial feedback using cost-aware data acquisition strategies.

Ax Philipp Becker, Niklas Freymuth, Serge Thilges, Fabian Otto, Gerhard Neumann 2/24/2026

TROLL: Trust Regions improve Reinforcement Learning for Large Language Models

TROLL: Trust region-based RL method that improves on PPO clipping for LLM fine-tuning, giving more stable reward-based training.

Ax Yuchen Zhu, Wei Guo, Jaemoo Choi, Petr Molodyk, Bo Yuan, Molei Tao, Yongxin Chen 2/24/2026

Enhancing Reasoning for Diffusion LLMs via Distribution Matching Policy Optimization

Novel RL algorithm for diffusion LLMs using distribution matching policy optimization to improve reasoning capabilities and match autoregressive LLM performance.

Ax Razvan Marinescu, Victoria-Elisabeth Gruber, Diego Fajardo 2/24/2026

Medical Interpretability and Knowledge Maps of Large Language Models

Systematic interpretability study of five LLMs' medical knowledge using activation analysis and layer lesioning techniques.

Ax Amirhossein Mozafari, Kourosh Hashemi, Erfan Shafagh, Soroush Motamedi, Azar Taheri Tayebi, Mohammad A. Tayebi 2/24/2026

CleverCatch: A Knowledge-Guided Weak Supervision Model for Fraud Detection

Weak supervision model for healthcare fraud detection using knowledge-guided learning with limited labeled data.

Ax Jialin Lu, Kye Emond, Kaiyu Yang, Swarat Chaudhuri, Weiran Sun, Wuyang Chen 2/24/2026

Lean Finder: Semantic Search for Mathlib That Understands User Intents

Semantic search engine for Mathlib, the math library of the Lean theorem prover, using intent-aware ranking for theorem discovery.

Ax Yizuo Chen, Adnan Darwiche 2/24/2026

On the Granularity of Causal Effect Identifiability

Causal inference research on state-based identifiability of causal effects in treatment-outcome relationships.

Ax Seohong Park, Aditya Oberai, Pranav Atreya, Sergey Levine 2/24/2026

Transitive RL: Value Learning via Divide and Conquer

RL algorithm using divide-and-conquer for offline goal-conditioned reinforcement learning value estimation.

Ax Yinghuan Zhang, Yufei Zhang, Parisa Kordjamshidi, Zijun Cui 2/24/2026

Bayesian Network Structure Discovery Using Large Language Models

Uses LLMs as the core component for Bayesian network structure discovery from data, replacing traditional structure learning methods that require extensive observational data.

Ax Arthur Chen, Zuxin Liu, Jianguo Zhang, Akshara Prabhakar, Zhiwei Liu, Shelby Heinecke, Silvio Savarese, Victor Zhong, Caiming Xiong 2/24/2026

Test-Time Adaptation for LLM Agents via Environment Interaction

Method for adapting LLM agents to novel environments through test-time interaction, addressing syntactic and semantic mismatches in observation formats and state dynamics.

Ax Hadi Reisizadeh, Jiajun Ruan, Yiwei Chen, Soumyadeep Pal, Sijia Liu, Mingyi Hong 2/24/2026

Leak@k: Unlearning Does Not Make LLMs Forget Under Probabilistic Decoding

Leak@k: study showing that existing LLM unlearning methods fail under probabilistic decoding even though they appear to succeed when evaluated with greedy decoding.

Ax Hamza Virk, Sandro Amaglobeli, Zuhayr Syed 2/24/2026

Blind Inverse Game Theory: Jointly Decoding Rewards and Rationality in Entropy-Regularized Competitive Games

Blind-IGT: inverse game theory method jointly decoding rewards and rationality in entropy-regularized competitive games with unknown rationality parameter.

Ax Bill Chunyuan Zheng, Vivek Myers, Benjamin Eysenbach, Sergey Levine 2/24/2026

Multistep Quasimetric Learning for Scalable Goal-conditioned Reinforcement Learning

Quasimetric learning method for goal-conditioned RL using multi-step returns to estimate temporal distance between observations over long horizons.

Ax Bernardo Perrone Ribeiro, Jana Faganeli Pucer 2/24/2026

FlowCast: Advancing Precipitation Nowcasting with Conditional Flow Matching

FlowCast: conditional flow matching method for radar-based precipitation nowcasting addressing uncertainty and high-dimensional data modeling.

Ax Zhenshuo Zhang, Minxuan Duan, Youran Ye, Hongyang R. Zhang 2/24/2026

Scalable Multi-Objective and Meta Reinforcement Learning via Gradient Estimation

Gradient estimation method for multi-objective and meta reinforcement learning, partitioning n objectives into k groups for language model preference optimization.

Ax Patryk Krukowski, Jan Miksa, Piotr Helm, Jacek Tabor, Paweł Wawrzyński, Przemysław Spurek 2/24/2026

InTAct: Interval-based Task Activation Consolidation for Continual Learning

InTAct: continual learning approach using interval-based task activation consolidation with mathematical guarantees against catastrophic forgetting.

Ax German Gritsai, Megan Richards, Maxime Méloux, Kyunghyun Cho, Maxime Peyrard 2/24/2026

MIST: Mutual Information Estimation Via Supervised Training

MIST: neural network-based mutual information estimator trained on 625K synthetic distributions with known ground-truth MI.

Ax Rui Xue, Shichao Zhu, Liang Qin, Tianfu Wu 2/24/2026

E2E-GRec: An End-to-End Joint Training Framework for Graph Neural Networks and Recommender Systems

E2E-GRec framework for end-to-end joint training of GNNs and recommender systems, replacing two-stage pipeline approach.

Ax Xiao Wu, Ting-Zhu Huang, Liang-Jian Deng, Xiaobing Yu, Yu Zhong, Shangqi Deng, Ufaq Khan, Jianghao Wu, Xiaofeng Liu, Imran Razzak, Xiaojun Chang, Yutong Xie 2/24/2026

SelfAI: A self-directed framework for long-horizon scientific discovery

SelfAI multi-agent system for self-directed long-horizon scientific discovery with human-in-the-loop workflows and exploration trade-offs.

Ax Yaswanth Chittepu, Raghavendra Addanki, Tung Mai, Anup Rao, Branislav Kveton 2/24/2026

ML-Tool-Bench: Tool-Augmented Planning for ML Tasks

ML-Tool-Bench framework for tool-augmented planning in autonomous ML agents orchestrating data analysis and model optimization workflows.

Ax Yifan Zhang, Zixiang Chen, Yifeng Liu, Zhen Qin, Huizhuo Yuan, Kangping Xu, Yang Yuan, Quanquan Gu, Andrew Chi-Chih Yao 2/24/2026

Group Representational Position Encoding

GRAPE framework unifying positional encoding mechanisms using group actions for multiplicative rotations and additive biases.

Ax Luca Miglior, Matteo Tolloso, Alessio Gravina, Davide Bacciu 2/24/2026

Can You Hear Me Now? A Benchmark for Long-Range Graph Propagation

ECHO benchmark for evaluating graph neural networks on long-range graph propagation and interaction tasks.

Ax Daniel M. Jimenez-Gutierrez, Mehrdad Hassanzadeh, David Solans, Mohammed Elbamby, Nicolas Kourtellis, Aris Anagnostopoulos, Ioannis Chatzigiannakis, Andrea Vitaletti 2/24/2026

Clust-PSI-PFL: A Population Stability Index Approach for Clustered Non-IID Personalized Federated Learning

Clustered personalized federated learning framework using Population Stability Index to handle non-IID data across clients.

Ax Mohammad Meymani, Roozbeh Razavi-Far 2/24/2026

Divided We Fall: Defending Against Adversarial Attacks via Soft-Gated Fractional Mixture-of-Experts with Randomized Adversarial Training

Soft-gated fractional mixture-of-experts with randomized adversarial training to defend ML models against adversarial attacks.

Ax Erin Carson, Xinye Chen 2/24/2026