Ax Yifei Zhang, Xu Yang, Xiao Yang, Bowen Xian, Qizheng Li, Shikai Fang, Jingyuan Li, Jian Wang, Mingrui Xu, Weiqing Liu, Jiang Bian 3/11/2026

Reasoning as Gradient: Scaling MLE Agents Beyond Tree Search

GOME: MLE agent framework replacing tree search with gradient-based optimization for machine learning engineering tasks using LLM reasoning.

Ax Siddharth Boppana, Annabel Ma, Max Loeffler, Raphael Sarfati, Eric Bigelow, Atticus Geiger, Owen Lewis, Jack Merullo 3/11/2026

Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought

Analysis of performative chain-of-thought in reasoning models, showing hidden beliefs diverge from generated reasoning tokens at task-specific difficulty levels.

Ax Peter Brodeur, Jacob M. Koshy, Anil Palepu, Khaled Saab, Ava Homiar, Roma Ruparel, Charles Wu, Ryutaro Tanno, Joseph Xu, Amy Wang, David Stutz, Hannah M. Ferrera, David Barrett, Lindsey Crowley, Jihyeon Lee, Spencer E. Rittner, Ellery Wulczyn, Selena K. Zhang, Elahe Vedadi, Christine G. Kohn, Kavita Kulkarni, Vinay Kadiyala, Sara Mahdavi, Wendy Du, Jessica Williams, David Feinbloom, Renee Wong, Tao Tu, Petar Sirkovic, Alessio Orlandi, Christopher Semturs, Yun Liu, Juraj Gottweis, Dale R. Webster, Joëlle Barral, Katherine Chou, Pushmeet Kohli, Avinatan Hassidim, Yossi Matias, James Manyika, Rob Fields, Jonathan X. Li, Marc L. Cohen, Vivek Natarajan, Mike Schaekermann, Alan Karthikesalingam, Adam Rodman 3/11/2026

A prospective clinical feasibility study of a conversational diagnostic AI in an ambulatory primary care clinic

Clinical feasibility study of LLM-based conversational AI (AMIE) for patient diagnostic history-taking in real-world primary care workflows with safety assessment.

Ax Ben Rank, Hardik Bhatnagar, Ameya Prabhu, Shira Eisenberg, Karina Nguyen, Matthias Bethge, Maksym Andriushchenko 3/11/2026

PostTrainBench: Can LLM Agents Automate LLM Post-Training?

PostTrainBench benchmark evaluating whether LLM agents can automate post-training of base LLMs into assistants.

Ax Panayiotis Raptis, Fatih Aslan, George Iosifidis 3/11/2026

Equitable Multi-Task Learning for AI-RANs

OWO-FMTL framework for fair multi-task learning in AI-enabled radio access networks with equitable user performance.

Ax Hui-Ze Tan, Xiao-Wen Yang, Hao Chen, Jie-Jing Shao, Yi Wen, Yuteng Shen, Weihong Luo, Xiku Du, Lan-Zhe Guo, Yu-Feng Li 3/11/2026

Hindsight Credit Assignment for Long-Horizon LLM Agents

HCAPO framework addressing credit assignment in long-horizon LLM agent tasks using hindsight and value baseline alignment.

Ax Michael Leznik 3/11/2026

The Temporal Markov Transition Field

Temporal extension of the Markov Transition Field that handles non-stationary time series by tracking regime changes instead of using a global transition matrix.

Ax Chloe H. Su, Zhe Ye, Samuel Tenka, Aidan Yang, Soonho Kong, Udaya Ghai 3/11/2026

Learning Adaptive LLM Decoding

Learning adaptive decoding policies for LLMs that dynamically select sampling strategies based on prompt difficulty and compute.

Ax Shuangfei Zhai 3/11/2026

Exclusive Self Attention

Exclusive self-attention mechanism constraining attention to orthogonal information for improved Transformer language modeling.