Ax Yebo Wu, Chunlin Tian, Jingguang Li, He Sun, Kahou Tam, Zhanting Zhou, Haicheng Liao, Jing Xiong, Zhijiang Guo, Li Li, Chengzhong Xu 2/25/2026

A Survey on Federated Fine-tuning of Large Language Models

Comprehensive survey of Federated Learning combined with LLM fine-tuning (FedLLM), covering privacy-preserving collaborative model adaptation methods.

Ax Yucheng Shi, Wenhao Yu, Jingyuan Huang, Wenlin Yao, Wenhu Chen, Ninghao Liu 2/25/2026

Towards Trustworthy GUI Agents: A Survey

Survey of trustworthy LLM-based GUI agents, identifying execution-gap challenges in automating real-world digital environments where actions can be irreversible.

Ax Jing Yu Lim, Rushi Shah, Zarif Ikram, Samson Yu, Haozhe Ma, Tze-Yun Leong, Dianbo Liu 2/25/2026

Performance Asymmetry in Model-Based Reinforcement Learning

Analysis of performance asymmetry in Model-Based RL agents on Atari100k, showing dramatic variance across task types despite high average performance.

Ax Zahra Shahrooei, Ali Baheri 2/25/2026

Wasserstein Barycenter Soft Actor-Critic

Wasserstein Barycenter Soft Actor-Critic algorithm improves sample efficiency in off-policy reinforcement learning via directed exploration.

Ax Andrey Goncharov, Daniil Vyazhev, Petr Sychev, Edvard Khalafyan, Alexey Zaytsev 2/25/2026

Complexity-aware fine-tuning

Efficient fine-tuning method for LLMs using entropy-based complexity detection to apply chain-of-thought reasoning selectively on difficult examples.

Ax Xuefeng Liu, Mingxuan Cao, Songhao Jiang, Xiao Luo, Xiaotian Duan, Mengdi Wang, Tobin R. Sosnick, Jinbo Xu, Rick Stevens 2/25/2026

Monte Carlo Tree Diffusion with Multiple Experts for Protein Design

MCTD-ME combines masked diffusion models with Monte Carlo Tree Search for protein design, addressing long-range dependencies and search space challenges.

Ax Jubayer Ibn Hamid, Ifdita Hasan Orney, Ellen Xu, Chelsea Finn, Dorsa Sadigh 2/25/2026

Polychromic Objectives for Reinforcement Learning

Polychromic objectives for reinforcement learning fine-tuning (RLFT) preserve policy diversity and prevent mode collapse.

Ax Siddarth Venkatraman, Vineet Jain, Sarthak Mittal, Vedant Shah, Johan Obando-Ceron, Yoshua Bengio, Brian R. Bartoldson, Bhavya Kailkhura, Guillaume Lajoie, Glen Berseth, Nikolay Malkin, Moksh Jain 2/25/2026

Recursive Self-Aggregation Unlocks Deep Thinking in Large Language Models

Recursive Self-Aggregation (RSA) test-time scaling method combines parallel and sequential inference to improve LLM reasoning capabilities.

Ax Lizhang Chen, Jonathan Li, Kaizhao Liang, Baiyu Su, Cong Xie, Nuo Wang Pierse, Chen Liang, Ni Lao, Qiang Liu 2/25/2026

Cautious Weight Decay

Cautious Weight Decay (CWD) optimizer modification applies weight decay only to parameters aligned with optimizer updates.

Ax Dario Shariatian, Alain Durmus, Umut Simsekli, Stefano Peluchetti 2/25/2026

Latent-Augmented Discrete Diffusion Models

Latent-Augmented Discrete Diffusion (LADD) improves discrete diffusion models for fast language generation by modeling cross-token dependencies.

Ax Alexandra Volkova, Mher Safaryan, Christoph H. Lampert, Dan Alistarh 2/25/2026

Towards Robust Scaling Laws for Optimizers

Research on scaling laws for LLM pretraining with different optimizers beyond AdamW, examining new optimizers like Muon, Shampoo, and SOAP.

Ax Cláudio Correia, Alberto E. A. Ferreira, Lucas Martins, Miguel P. Bento, Sofia Guerreiro, Ricardo Ribeiro Pereira, Ana Sofia Gomes, Jacopo Bono, Hugo Ferreira, Pedro Bizarro 2/25/2026

MUSE: Multi-Tenant Model Serving With Seamless Model Updates

Multi-tenant ML serving system handling seamless model updates while maintaining decision thresholds across clients with distribution shifts.

Ax DatologyAI, :, Aldo Gael Carranza, Kaleigh Mentzer, Ricardo Pio Monti, Alex Fang, Alvin Deng, Amro Abbas, Anshuman Suri, Brett Larsen, Cody Blakeney, Darren Teh, David Schwab, Diego Kiner, Fan Pan, Haakon Mongstad, Haoli Yin, Jack Urbanek, Jason Lee, Jason Telanoff, Josh Wills, Luke Merrick, Maximilian Böther, Parth Doshi, Paul Burstein, Pratyush Maini, Rishabh Adaiga, Sid Joshi, Spandan Das, Tony Jiang, Vineeth Dorna, Zhengping Wang, Bogdan Gaza, Ari Morcos, Matthew Leavitt 2/25/2026

ÜberWeb: Insights from Multilingual Curation for a 20-Trillion-Token Dataset

Study of multilingual data curation across 13 languages, identifying interference patterns and optimal training strategies for a 20-trillion-token dataset.

Ax GLM-5-Team, :, Aohan Zeng, Xin Lv, Zhenyu Hou, Zhengxiao Du, Qinkai Zheng, Bin Chen, Da Yin, Chendi Ge, Chenghua Huang, Chengxing Xie, Chenzheng Zhu, Congfeng Yin, Cunxiang Wang, Gengzheng Pan, Hao Zeng, Haoke Zhang, Haoran Wang, Huilong Chen, Jiajie Zhang, Jian Jiao, Jiaqi Guo, Jingsen Wang, Jingzhao Du, Jinzhu Wu, Kedong Wang, Lei Li, Lin Fan, Lucen Zhong, Mingdao Liu, Mingming Zhao, Pengfan Du, Qian Dong, Rui Lu, Shuang-Li, Shulin Cao, Song Liu, Ting Jiang, Xiaodong Chen, Xiaohan Zhang, Xuancheng Huang, Xuezhen Dong, Yabo Xu, Yao Wei, Yifan An, Yilin Niu, Yitong Zhu, Yuanhao Wen, Yukuo Cen, Yushi Bai, Zhongpei Qiao, Zihan Wang, Zikang Wang, Zilin Zhu, Ziqiang Liu, Zixuan Li, Bojie Wang, Bosi Wen, Can Huang, Changpeng Cai, Chao Yu, Chen Li, Chengwei Hu, Chenhui Zhang, Dan Zhang, Daoyan Lin, Dayong Yang, Di Wang, Ding Ai, Erle Zhu, Fangzhou Yi, Feiyu Chen, Guohong Wen, Hailong Sun, Haisha Zhao, Haiyi Hu, Hanchen Zhang, Hanrui Liu, Hanyu Zhang, Hao Peng, Hao Tai, Haobo Zhang, He Liu, Hongwei Wang, Hongxi Yan, Hongyu Ge, Huan Liu, Huanpeng Chu, Jia'ni Zhao, Jiachen Wang, Jiajing Zhao, Jiamin Ren, Jiapeng Wang, Jiaxin Zhang, Jiayi Gui, Jiayue Zhao, Jijie Li, Jing An, Jing Li, Jingwei Yuan, Jinhua Du, Jinxin Liu, Junkai Zhi, Junwen Duan, Kaiyue Zhou, Kangjian Wei, Ke Wang, Keyun Luo, Laiqiang Zhang, Leigang Sha, Liang Xu, Lindong Wu, Lintao Ding, Lu Chen, Minghao Li, Nianyi Lin, Pan Ta, Qiang Zou, Rongjun Song, Ruiqi Yang, Shangqing Tu, Shangtong Yang, Shaoxiang Wu, Shengyan Zhang, Shijie Li, Shuang Li, Shuyi Fan, Wei Qin, Wei Tian, Weining Zhang, Wenbo Yu, Wenjie Liang, Xiang Kuang, Xiangmeng Cheng, Xiangyang Li, Xiaoquan Yan, Xiaowei Hu, Xiaoying Ling, Xing Fan, Xingye Xia, Xinyuan Zhang, Xinze Zhang, Xirui Pan, Xu Zou, Xunkai Zhang, Yadi Liu, Yandong Wu, Yanfu Li, Yidong Wang, Yifan Zhu, Yijun Tan, Yilin Zhou, Yiming Pan, Ying Zhang, Yinpei Su, Yipeng Geng, Yong Yan, Yonglin Tan, Yuean Bi, Yuhan Shen, Yuhao Yang, Yujiang Li, Yunan Liu, Yunqing Wang, Yuntao Li, 
Yurong Wu, Yutao Zhang, Yuxi Duan, Yuxuan Zhang, Zezhen Liu, Zhengtao Jiang, Zhenhe Yan, Zheyu Zhang, Zhixiang Wei, Zhuo Chen, Zhuoer Feng, Zijun Yao, Ziwei Chai, Ziyuan Wang, Zuzhou Zhang, Bin Xu, Minlie Huang, Hongning Wang, Juanzi Li, Yuxiao Dong, Jie Tang 2/25/2026

GLM-5: from Vibe Coding to Agentic Engineering

GLM-5 foundation model, marking a shift from vibe coding to agentic engineering, with DSA for cost reduction and asynchronous RL infrastructure for improved autonomy.

Ax Ziliang Zhao, Bi Xue, Emma Lin, Mengjiao Zhou, Kaustubh Vartak, Shakhzod Ali-Zade, Tianqi Lu, Tao Li, Bin Kuang, Rui Jian, Bin Wen, Dennis van der Staay, Yixin Bao, Eddy Li, Chao Deng, Songbin Liu, Qifan Wang, Kai Ren 2/25/2026

Multi-Probe Zero Collision Hash (MPZCH): Mitigating Embedding Collisions and Enhancing Model Freshness in Large-Scale Recommenders

MPZCH indexing mechanism for large-scale recommendation systems to mitigate embedding collisions and improve model freshness in embedding tables.