Ax Nicolas Menet, Aleksandar Terzić, Michael Hersche, Andreas Krause, Abbas Rahimi 3/2/2026

Thompson Sampling via Fine-Tuning of LLMs

Bayesian optimization method that fine-tunes an LLM to perform Thompson sampling in large discrete spaces, where gradient-based acquisition maximization is unavailable.

Ax Jack Hong, Chenxiao Zhao, ChengLin Zhu, Weiheng Lu, Guohai Xu, Xing Yu 3/2/2026

DeepEyesV2: Toward Agentic Multimodal Model

Agentic multimodal model framework enabling tool invocation (code execution, web search) and reasoning integration for vision-language tasks.

Ax Ziyi Chen, Yingnan Guo, Zedong Chu, Minghua Luo, Yanfen Shen, Mingchao Sun, Junjun Hu, Shichao Xie, Kuan Yang, Pei Shi, Zhining Gu, Lu Liu, Honglin Han, Xiaolong Wu, Mu Xu, Yu Zhang, Ning Guo 3/2/2026

SocialNav: Training Human-Inspired Foundation Model for Socially-Aware Embodied Navigation

SocialNav foundation model for socially-aware embodied navigation with hierarchical architecture trained on 7M samples for human-compliant trajectory generation.

Ax Yu-Chao Hsu, Jiun-Cheng Jiang, Chun-Hua Lin, Kuo-Chung Peng, Nan-Yow Chen, Samuel Yen-Chi Chen, En-Jui Kuo, Hsi-Sheng Goan 3/2/2026

QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold Long Short-term Memory

QKAN-LSTM combining quantum-inspired Kolmogorov-Arnold networks with LSTM for improved sequential modeling with reduced parameter redundancy.

Ax Li Ju, Jun Zhao, Mingxu Chai, Ziyu Shen, Xiangyang Wang, Yage Geng, Chunchun Ma, Hao Peng, Guangbin Li, Tao Li, Chengyong Liao, Fu Wang, Xiaolong Wang, Junshen Chen, Rui Gong, Shijia Liang, Feiyan Li, Ming Zhang, Kexin Tan, Junjie Ye, Zhiheng Xi, Shihan Dou, Tao Gui, Yuankai Ying, Yang Shi, Yue Zhang, Qi Zhang 3/2/2026

WisPaper: Your AI Scholar Search Engine

WisPaper end-to-end agent system for academic literature discovery and organization combining semantic search verification with workflow integration.

Ax Jiyoon Pyo, Yuankun Jiao, Dongwon Jung, Zekun Li, Leeje Jang, Sofia Kirsanova, Jina Kim, Yijun Lin, Qin Liu, Junyi Xie, Hadi Askari, Nan Xu, Muhao Chen, Yao-Yi Chiang 3/2/2026

FRIEDA: Benchmarking Multi-Step Cartographic Reasoning in Vision-Language Models

FRIEDA benchmark evaluating vision-language models on multi-step cartographic reasoning with map interpretation for disaster response and urban planning.

Ax Bartłomiej Starosta, Sławomir T. Wierzchoń, Piotr Borkowski, Dariusz Czerski, Marcin Sydow, Eryk Laskowski, Mieczysław A. Kłopotek 3/2/2026

Rough Sets for Explainability of Spectral Graph Clustering

Rough-set methodology for explaining spectral graph clustering results on text documents, including the handling of documents without clear content meaning.

Ax Aaron Defazio, Konstantin Mishchenko, Parameswaran Raman, Hao-Jun Michael Shi, Lin Xiao 3/2/2026

Smoothing DiLoCo with Primal Averaging for Faster Training of LLMs

Generalized Primal Averaging optimizer extending Nesterov's method for faster LLM training, unifying DiLoCo and schedule-free approaches with reduced memory requirements.

Ax Yingru Li, Jiacai Liu, Jiawei Xu, Yuxuan Tong, Ziniu Li, Qian Liu, Baoxiang Wang 3/2/2026

Trust Region Masking for Long-Horizon LLM Reinforcement Learning

Trust region masking technique for LLM reinforcement learning addressing off-policy mismatch and approximation errors from implementation divergences in policy gradient optimization.

Ax Iván Arcuschin, David Chanin, Adrià Garriga-Alonso, Oana-Maria Camburu 3/2/2026

Biases in the Blind Spot: Detecting What LLMs Fail to Mention

Automated black-box pipeline detecting unverbalized biases in LLM chain-of-thought reasoning without predefined categories using task-specific evaluation.

Ax Daniel Romero-Alvarado, Fernando Martínez-Plumed, Lorenzo Pacchiardi, Hugo Save, Siddhesh Milind Pawar, Behzad Mehrbakhsh, Pablo Antonio Moreno Casares, Ben Slater, Paolo Bova, Peter Romero, Zachary R. Tyler, Jonathan Prunty, Luning Sun, Jose Hernandez-Orallo 3/2/2026

Capabilities Ain't All You Need: Measuring Propensities in AI

Framework extending Item Response Theory to measure AI model propensities and behavioral tendencies beyond capability metrics.