David Pardoe, Neil Daftary, Miro Furtado, Aditya Aiyer, Yu Wang, Liuqing Li, Tao Song, Lars Hertel, Young Jin Yun, Senthil Radhakrishnan, Zhiwei Wang, Tommy Li, Khai Tran, Ananth Nagarajan, Ali Naqvi, Yue Zhang, Renpeng Fang, Avi Romascanu, Arjun Kulothungun, Deepak Kumar, Praneeth Boda, Fedor Borisyuk, Ruoyan Wang 2/13/2026

CADET: Context-Conditioned Ads CTR Prediction With a Decoder-Only Transformer

CADET applies a decoder-only transformer architecture to CTR prediction in ads systems, addressing the challenges posed by contextual post-scoring constraints.

Congmin Zheng, Xiaoyun Mo, Xinbei Ma, Qiqiang Lin, Yin Zhao, Jiachen Zhu, Xingyu Lou, Jun Wang, Zhaoxiang Wang, Weiwen Liu, Zhuosheng Zhang, Yong Yu, Weinan Zhang 2/13/2026

Adaptive Milestone Reward for GUI Agents

Adaptive Milestone Reward addresses temporal credit assignment in RL-trained GUI agents by balancing outcome and process rewards with adaptive thresholds.

Jingkun Liu, Yisong Yue, Max Welling, Yue Song 2/13/2026

Krause Synchronization Transformers

The Krause Attention mechanism, inspired by bounded-confidence dynamics, prevents representation collapse in transformers by decoupling softmax normalization.

Sisuo Lyu, Siru Zhong, Tiegang Chen, Weilin Ruan, Qingxiang Liu, Taiqiang Lv, Qingsong Wen, Raymond Chi-Wing Wong, Yuxuan Liang 2/13/2026

TS-Memory: Plug-and-Play Memory for Time Series Foundation Models

TS-Memory adds a plug-and-play memory module to time series foundation models for efficient adaptation under distribution shift without catastrophic forgetting.

Yair Schiff, Omer Belhasin, Roy Uziel, Guanghan Wang, Marianne Arriola, Gilad Turok, Michael Elad, Volodymyr Kuleshov 2/13/2026

Learn from Your Mistakes: Self-Correcting Masked Diffusion Models

A framework that enables masked diffusion models to correct tokens after unmasking, reducing error accumulation in parallel generation.

Naveen Sahi, Jeremy Dohmann, Armen Aghajanyan, Akshat Shrivastava 2/13/2026

SkillRater: Untangling Capabilities in Multimodal Data

SkillRater decomposes data quality into multiple capability dimensions rather than a single score, improving data curation for model training.

Shervin Ghasemlou 2/13/2026

Dopamine: Brain Modes, Not Brains

A parameter-efficient fine-tuning method that treats adaptation as neuromodulation-inspired mode selection and rescaling of pretrained computations.

Cláudio Correia, Alberto E. A. Ferreira, Lucas Martins, Miguel P. Bento, Sofia Guerreiro, Ricardo Ribeiro Pereira, Ana Sofia Gomes, Jacopo Bono, Hugo Ferreira, Pedro Bizarro 2/13/2026

MUSE: Multi-Tenant Model Serving With Seamless Model Updates

A multi-tenant model serving system that handles seamless model updates through dynamic management of decision thresholds.

Sebastian Zeng, Andreas Petersson, Wolfgang Bock 2/13/2026

Latent-Variable Learning of SPDEs via Wiener Chaos

A method for learning stochastic partial differential equations (SPDEs) from spatiotemporal observations using a latent-variable formulation and deep learning.