Guoliang Zhao, Ruobing Xie, An Wang, Shuaipeng Li, Huaibing Xie, Xingwu Sun 3/26/2026

Self-Distillation for Multi-Token Prediction

Self-distillation method for multi-token prediction (MTP) in LLMs to improve inference efficiency and MTP head acceptance rates.

Hongjie Chen, Hanyu Meng, Huimin Zeng, Ryan A. Rossi, Lie Lu, Josh Kimball 3/26/2026

Variable-Length Audio Fingerprinting

Variable-length audio fingerprinting method using deep learning for robust recognition of distorted recordings.

Allen Nie, Xavier Daull, Zhiyi Kuang, Abhinav Akkiraju, Anish Chaudhuri, Max Piasevoli, Ryan Rong, YuCheng Yuan, Prerit Choudhary, Shannon Xiao, Rasool Fakoor, Adith Swaminathan, Ching-An Cheng 3/26/2026

Understanding the Challenges in Iterative Generative Optimization with LLMs

Analysis of challenges in iterative generative optimization with LLMs for self-improving agents, identifying hidden design choices limiting adoption.

Jingzhi Fang, Xiong Gao, Renwei Zhang, Zichun Ye, Lei Chen, Jie Zhao, Chengnuo Huang, Hui Xu, Xuefeng Jin 3/26/2026

DVM: Real-Time Kernel Generation for Dynamic AI Models

DVM, a runtime kernel generation system for efficient compilation of dynamic AI models with variable tensor shapes and control flow.