Ax Jay Chooi, Paul Gölz, Ariel D. Procaccia, Benjamin Schiffer, Shirley Zhang 3/18/2026

Finding Common Ground in a Sea of Alternatives

Formal model for selecting statements that find common ground across diverse preferences using generative AI.

Ax Jian Yang, Wei Zhang, Jiajun Wu, Junhang Cheng, Shawn Guo, Haowen Wang, Weicheng Gu, Yaxin Du, Joseph Li, Fanglin Xu, Yizhi Li, Lin Jing, Yuanbo Wang, Yuhan Gao, Ruihao Gong, Chuan Hao, Ran Tao, Aishan Liu, Tuney Zheng, Ganqu Cui, Zhoujun Li, Mingjie Tang, Chenghua Lin, Wayne Xin Zhao, Xianglong Liu, Ming Zhou, Bryan Dai, Weifeng Lv 3/18/2026

InCoder-32B: Code Foundation Model for Industrial Scenarios

InCoder-32B, a 32B code foundation model optimized for industrial programming tasks with hardware semantics and resource constraints.

Ax Jun Saito, Jiefeng Li, Michael de Ruyter, Miguel Guerrero, Edy Lim, Ehsan Hassani, Roger Blanco Ribera, Hyejin Moon, Magdalena Dadela, Marco Di Lucca, Qiao Wang, Xueting Li, Jan Kautz, Simon Yuen, Umar Iqbal 3/18/2026

SOMA: Unifying Parametric Human Body Models

SOMA, a unified parametric body model that bridges incompatibilities between SMPL, SMPL-X, and related human body representations.

Ax Kaixuan Wang, Tianxing Chen, Jiawei Liu, Honghao Su, Shaolong Zhu, Minxuan Wang, Zixuan Li, Yue Chen, Huan-ang Gao, Yusen Qin, Jiawei Wang, Qixuan Zhang, Lan Xu, Jingyi Yu, Yao Mu, Ping Luo 3/18/2026

ManiTwin: Scaling Data-Generation-Ready Digital Object Dataset to 100K

ManiTwin pipeline generates 100K simulation-ready 3D digital object twins from single images for robotic manipulation training.

Ax Ruisi Wang, Zhongang Cai, Fanyi Pu, Junxiang Xu, Wanqi Yin, Maijunxian Wang, Ran Ji, Chenyang Gu, Bo Li, Ziqi Huang, Hokin Deng, Dahua Lin, Ziwei Liu, Lei Yang 3/18/2026

Demystifying Video Reasoning

Study examining reasoning mechanisms in diffusion-based video models, challenging chain-of-frames assumptions about how reasoning emerges.

Ax Xiao Zhu, Chenmien Tan, Pinzhen Chen, Rico Sennrich, Huiming Wang, Yanlin Zhang, Hanxu Hu 3/18/2026

CHARM: Calibrating Reward Models With Chatbot Arena Scores

CHARM method calibrates reward models using Chatbot Arena scores to mitigate model preference bias and reward hacking in RLHF.

Ax Siru Ouyang, Jun Yan, I-Hung Hsu, Yanfei Chen, Ke Jiang, Zifeng Wang, Rujun Han, Long T. Le, Samira Daruki, Xiangru Tang, Vishy Tirumalashetty, George Lee, Mahsan Rofouei, Hangfei Lin, Jiawei Han, Chen-Yu Lee, Tomas Pfister 3/18/2026

ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory

ReasoningBank memory framework enables LLM agents to learn from interaction history and distill generalizable reasoning strategies for continuous tasks.

Ax Sumanth Varambally, Marshall Fisher, Jas Thakker, Yiwei Chen, Zhirui Xia, Yasaman Jafari, Ruijia Niu, Manas Jain, Veeramakali Vignesh Manivannan, Zachary Novack, Luyu Han, Srikar Eranky, Salva Rühling Cachay, Taylor Berg-Kirkpatrick, Duncan Watson-Parris, Yi-An Ma, Rose Yu 3/18/2026

Zephyrus: An Agentic Framework for Weather Science

Zephyrus agentic framework combines weather foundation models with LLM reasoning for interactive scientific workflows in meteorology.

Ax Dachuan Lin, Guobin Shen, Zihao Yang, Tianrong Liu, Dongcheng Zhao, Yi Zeng 3/18/2026

Efficient LLM Safety Evaluation through Multi-Agent Debate

Multi-agent debate framework using small language models for cost-efficient LLM safety evaluation, with HAJailBench benchmark for jailbreak testing.

Ax Sunghyun Wee, Suyoung Kim, Hyeonjin Kim, Kyomin Hwang, Nojun Kwak 3/18/2026

Alignment-Aware Quantization for LLM Safety

Alignment-Aware Quantization: post-training quantization (PTQ) method for efficient LLM deployment that preserves behavioral alignment and safety properties rather than only minimizing reconstruction error.

Ax Nuoya Xiong, Yuhang Zhou, Hanqing Zeng, Zhaorun Chen, Furong Huang, Shuchao Bi, Lizhu Zhang, Zhuokai Zhao 3/18/2026

Token-Level LLM Collaboration via FusionRoute

FusionRoute: token-level collaboration method enabling multiple specialized LLMs to work together, combining the efficiency of domain experts with generalization.

Ax Ruoran Li, Xinghua Zhang, Haiyang Yu, Shitong Duan, Xiang Li, Wenxin Xiang, Chonghua Liao, Xudong Guo, Yongbin Li, Jinli Suo 3/18/2026

MemPO: Self-Memory Policy Optimization for Long-Horizon Agents

MemPO: self-memory policy optimization approach enabling long-horizon agents to proactively manage memory content aligned with task objectives.

Ax Linus Folkerts, Will Payne, Simon Inman, Philippos Giavridis, Joe Skinner, Sam Deverett, James Aung, Ekin Zorer, Michael Schmatz, Mahmoud Ghanem, John Wilkinson, Alan Steer, Vy Hong, Jessica Wang 3/18/2026

Measuring AI Agents' Progress on Multi-Step Cyber Attack Scenarios

Evaluation of frontier AI models' autonomous cyber-attack capabilities on multi-step scenarios, tracking capability trends across 18 months of model releases.