Ax Jinzhou Tan, Gabriel Adineera, Jinoh Kim 3/10/2026

ProgAgent: A Continual RL Agent with Progress-Aware Rewards

ProgAgent is a continual reinforcement learning agent that learns progress-aware rewards from unlabeled expert videos to address catastrophic forgetting in robotic learning, built on a JAX-based architecture.

Ax Caihao Sun, Mingqi Yuan, Shiyuan Wang, Jiayu Chen 3/10/2026

Vision Transformers that Never Stop Learning

Investigates loss of plasticity in Vision Transformers under continual learning, examining why attention-based models struggle to adapt to new tasks over time.

Ax Zaid Abdullah, Merouane Debbah, Symeon Chatzinotas, Bjorn Ottersten 3/10/2026

Neural Precoding in Complex Projective Spaces

Deep learning approach for multi-user MIMO wireless precoding using complex projective space parameterization of neural network outputs.

Ax Théo Vincent, Kevin Gerhardt, Yogesh Tripathi, Habib Maraqten, Adam White, Martha White, Jan Peters, Carlo D'Eramo 3/10/2026

Gradient Iterated Temporal-Difference Learning

Temporal-difference reinforcement learning algorithm that incorporates gradients of bootstrapped estimates to improve stability over semi-gradient approaches.

Ax Abduragim Shtanchaev, Albina Ilina, Yazid Janati, Arip Asadulaev, Martin Takáč, Eric Moulines 3/10/2026

Guess & Guide: Gradient-Free Zero-Shot Diffusion Guidance

Gradient-free guidance method for diffusion models in Bayesian inverse problems that avoids computationally expensive vector-Jacobian products.

Ax Boris Kriuk, Fedor Kriuk 3/10/2026

PSTNet: Physically-Structured Turbulence Network

PSTNet estimates atmospheric turbulence intensity using physics-structured ML models respecting conservation laws for real-time aircraft safety applications.

Ax Qianyu Yang, Yang Liu, Jiaqi Li, Jun Bai, Hao Chen, Kaiyuan Chen, Tiliang Duan, Jiayun Dong, Xiaobo Hu, Zixia Jia, Yang Liu, Tao Peng, Yixin Ren, Ran Tian, Zaiyuan Wang, Yanglihong Xiao, Gang Yao, Lingyue Yin, Ge Zhang, Chun Zhang, Jianpeng Jiao, Zilong Zheng, Yuan Gong 3/10/2026

$OneMillion-Bench: How Far are Language Agents from Human Experts?

$OneMillion-Bench evaluates language agents on 400 expert-curated real-world tasks across Law, Finance, Healthcare, Industry, and Science requiring multi-step reasoning and tool use.

Ax Bhavesh Kumar, Dylan Feng, Leonard Tang 3/10/2026

MJ1: Multimodal Judgment via Grounded Verification

MJ1 is a multimodal judge trained with RL to enforce visual grounding through structured verification chains and counterfactual consistency rewards.

Ax Paulius Rauba, Claudio Fanconi, Mihaela van der Schaar 3/10/2026

Tiny Autoregressive Recursive Models

Explores autoregressive tiny recursive models (TRMs) for general prediction tasks, extending the TRM mechanism beyond ARC-AGI to support iterative refinement in diverse domains.

Ax Mingxi Zou, Jiaxiang Chen, Junfan Li, Langzhang Liang, Qifan Wang, Xu Yinghui, Zenglin Xu 3/10/2026

DARC: Disagreement-Aware Alignment via Risk-Constrained Decoding

DARC proposes an inference-time method for aligning LLMs with heterogeneous human preferences by framing response selection as a risk-constrained decoding problem, avoiding retraining.

Ax Jiayang Gao, Tianyi Zheng, Jiayang Zou, Fengxiang Yang, Shice Liu, Luyao Fan, Zheyu Zhang, Hao Zhang, Jinwei Chen, Peng-Tao Jiang, Bo Li, Jia Wang 3/10/2026

C$^2$FG: Control Classifier-Free Guidance via Score Discrepancy Analysis

Theoretical analysis of classifier-free guidance in diffusion models with adaptive score discrepancy-based control for better conditional generation.

Ax Sidharth Sinha, Anson Bastos, Xuchao Zhang, Akshay Nambi, Chetan Bansal, Saravan Rajmohan 3/10/2026

AutoAdapt: An Automated Domain Adaptation Framework for LLMs

AutoAdapt is an automated framework for domain adaptation of LLMs that handles hyperparameter selection and evolving knowledge without manual tuning.

Ax Chang Li, Tshihao Tsu, Yaren Zhang, Chao Xue, Xiaodong He 3/10/2026

Fibration Policy Optimization

Fibration Policy Optimization introduces APC-Obj for training heterogeneous LLM systems with multi-scale hierarchical stability control.