Ax Jubayer Ibn Hamid, Ifdita Hasan Orney, Ellen Xu, Chelsea Finn, Dorsa Sadigh 4/2/2026

Polychromic Objectives for Reinforcement Learning

Addresses mode collapse in reinforcement learning fine-tuning by introducing polychromic objectives that preserve policy diversity and enable better exploration.

Ax Asad Aali, Muhammad Ahmed Mohsin, Vasiliki Bikia, Arnav Singhvi, Richard Gaus, Suhana Bedi, Hejie Cui, Miguel Fuentes, Alyssa Unell, Yifan Mai, Jordan Cahoon, Michael Pfeffer, Roxana Daneshjou, Sanmi Koyejo, Emily Alsentzer, Christopher Potts, Nigam H. Shah, Akshay S. Chaudhari 4/2/2026

Structured Prompts Improve Evaluation of Language Models

Study showing structured prompts significantly improve LLM evaluation accuracy and reduce prompt-dependent variance in benchmark frameworks like HELM.

Ax Isha Chaudhary, Vedaant Jain, Prineet Parhar, Kavya Sachdeva, Avaljot Singh, Sayan Ranu, Gagandeep Singh 4/2/2026

Lumos: Let there be Language Model System Certification

Lumos framework for formally certifying language model system behaviors using imperative probabilistic programming with graph-based prompt generation.

Ax Eason Chen, Sophia Judicke, Kayla Beigh, Xinyi Tang, Isabel Wang, Nina Yuan, Zimo Xiao, Chuangji Li, Shizhuo Li, Reed Luttmer, Shreya Singh, Maria Yampolsky, Naman Parikh, Yvonne Zhao, Meiyi Chen, Scarlett Huang, Anishka Mohanty, Gregory Johnson, John Mackey, Jionghao Lin, Ken Koedinger 4/2/2026

Chat-Based Support Alone May Not Be Enough: Comparing Conversational and Embedded LLM Feedback for Mathematical Proof Learning

Empirical evaluation of the GPTutor LLM tutoring system, comparing embedded proof-review feedback with chatbot support for discrete mathematics learning.

Ax Ruiying Li, Yunlang Zhou, YuYao Zhu, Kylin Chen, Jingyuan Wang, Sukai Wang, Kongtao Hu, Minhui Yu, Bowen Jiang, Zhan Su, Jiayao Ma, Xin He, Yongjian Shen, Yang Yang, Guanghui Ren, Maoqing Yao, Wenhao Wang, Yao Mu 4/2/2026

RoboClaw: An Agentic Framework for Scalable Long-Horizon Robotic Tasks

RoboClaw agentic framework unifying data collection, policy learning, and deployment for long-horizon robotic tasks with vision-language-action systems.

Ax Haoyang Fang, Shuai Zhang, Yifei Ma, Hengyi Wang, Cuixiong Hu, Katrin Kirchhoff, Bernie Wang, George Karypis 4/2/2026

OPERA: Online Data Pruning for Efficient Retrieval Model Adaptation

OPERA framework for data pruning in dense retrieval models that improves both efficiency and effectiveness of domain-specific finetuning through heterogeneous pair selection.

Ax Devashish Gaikwad, Wil M. P. van der Aalst, Gyunam Park 4/2/2026

Neuro-Symbolic Process Anomaly Detection

Neuro-symbolic approach combining neural networks with domain knowledge for process anomaly detection in event logs.

Ax Yufei Xu, Fanxu Meng, Fan Jiang, Yuxuan Wang, Ruijie Zhou, Zhaohui Wang, Jiexi Wu, Zhixin Pan, Xiaojuan Tang, Wenjie Pei, Tongxuan Liu, Di Yin, Xing Sun, Muhan Zhang 4/2/2026

HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention

HISA improves efficiency of sparse attention mechanisms by optimizing hierarchical indexing to reduce bottlenecks in token-level key selection for LLMs.

Ax Adrian Martínez, Ananya Gupta, Hanka Goralija, Mario Rico, Saúl Fenollosa, Tamar Alphaidze 4/2/2026

Evolution Strategies for Deep RL pretraining

Evolution strategies for deep RL pretraining, offering a derivative-free, computationally efficient alternative to standard deep reinforcement learning.

Ax Leonardo Medrano Sandonas, David Balcells, Anton Bochkarev, Jacqueline M. Cole, Volker L. Deringer, Werner Dobrautz, Adrian Ehrenhofer, Thorben Frank, Pascal Friederich, Rico Friedrich, Janine George, Luca Ghiringhelli, Alejandra Hinostroza Caldas, Veronika Juraskova, Hannes Kneiding, Yury Lysogorskiy, Johannes T. Margraf, Hanna Türk, Anatole von Lilienfeld, Milica Todorović, Alexandre Tkatchenko, Mariana Rossi, Gianaurelio Cuniberti 4/2/2026

Perspective: Towards sustainable exploration of chemical spaces with machine learning

Perspective on sustainability challenges in AI-driven molecular and materials discovery across QM data, training, and automation pipelines.