Koichi Tanaka, Ren Kishimoto, Bushun Kawagishi, Yusuke Narita, Yasuo Yamamoto, Nobuyuki Shimizu, Yuta Saito 3/20/2026

Off-Policy Learning with Limited Supply

Studies off-policy learning in contextual bandits with supply constraints for recommendation and advertising systems.

Saaket Agashe, Jayanth Srinivasa, Gaowen Liu, Ramana Kompella, Xin Eric Wang 3/20/2026

Context Bootstrapped Reinforcement Learning

Method that augments reinforcement learning from verifiable rewards with context bootstrapping to improve exploration and the acquisition of reasoning patterns.

Edward Lin, Sahil Modi, Siva Kumar Sastry Hari, Qijing Huang, Zhifan Ye, Nestor Qin, Fengzhe Zhou, Yuan Zhang, Jingquan Wang, Sana Damani, Dheeraj Peri, Ouye Xie, Aditya Kane, Moshe Maor, Michael Behar, Triston Cao, Rishabh Mehta, Vartika Singh, Vikram Sharma Mailthody, Terry Chen, Zihao Ye, Hanfeng Chen, Tianqi Chen, Vinod Grover, Wei Chen, Wei Liu, Eric Chung, Luis Ceze, Roger Bringmann, Cyril Zeller, Michael Lightstone, Christos Kozyrakis, Humphrey Shi 3/20/2026

SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GPU Kernels Against Hardware Limits

235-problem benchmark for agentic AI systems that measures CUDA kernel optimization against hardware efficiency limits.