Ax Huaiyang Wang, Xiaojie Li, Deqing Wang, Haoyi Zhou, Zixuan Huang, Yaodong Yang, Jianxin Li, Yikun Ban 4/2/2026

Policy Improvement Reinforcement Learning

Reinforcement learning approach with verification for iteratively improving LLM policies based on actual performance gains.

Ax Ken M. Nakanishi 4/2/2026

Screening Is Enough

Multiscreen mechanism for language models that enables absolute query-key relevance assessment, going beyond relative attention redistribution.

Ax Shikhar Bharadwaj, Chin-Jou Li, Kwanghee Choi, Eunjung Yeo, William Chen, Shinji Watanabe, David R. Mortensen 4/2/2026

An Empirical Recipe for Universal Phone Recognition

PhoneticXEUS model for robust multilingual phone recognition trained on large-scale data with pretrained representations.

Ax Silong Yong, Stephen Sheng, Carl Qi, Xiaojie Wang, Evan Sheehan, Anurag Shivaprasad, Yaqi Xie, Katia Sycara, Yesh Dattatreya 4/2/2026

Generalizable Dense Reward for Long-Horizon Robotic Tasks

Framework combining vision-language models with RL to generate dense rewards for long-horizon robotic tasks, reducing manual reward engineering.

Ax Lei Huang, Chuan Qiu, Kuan-Jui Su, Anqi Liu, Yun Gong, Weiqiang Lin, Lindong Jiang, Chen Zhao, Meng Song, Jeffrey Deng, Qing Tian, Zhe Luo, Ping Gong, Hui Shen, Chaoyang Zhang, Hong-Wen Deng 4/2/2026

GenoBERT: A Language Model for Accurate Genotype Imputation

GenoBERT uses transformers for reference-free genotype imputation without ancestry bias.