Ax Shuofei Qiao, Yanqiu Zhao, Zhisong Qiu, Xiaobin Wang, Jintian Zhang, Zhao Bin, Ningyu Zhang, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen 3/2/2026

Scaling Generalist Data-Analytic Agents

DataMind framework trains scalable generalist data-analytic agents on diverse data formats for multi-step reasoning in scientific discovery.

Ax Constanza Fierro, Fabien Roger 3/2/2026

Steering Language Models with Weight Arithmetic

Contrastive weight steering method edits LLM parameters via weight arithmetic for post-training behavior modification without expensive retraining.

Ax Yu-Chao Hsu, Jiun-Cheng Jiang, Chun-Hua Lin, Kuo-Chung Peng, Nan-Yow Chen, Samuel Yen-Chi Chen, En-Jui Kuo, Hsi-Sheng Goan 3/2/2026

QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold Long Short-term Memory

QKAN-LSTM combines quantum-inspired methods with Kolmogorov-Arnold networks for improved sequential modeling with reduced parameter redundancy.

Ax Azmine Toushik Wasi, Wahid Faisal, Abdur Rahman, Mahfuz Ahmed Anik, Munem Shahriar, Mohsin Mahmud Topu, Sadia Tasnim Meem, Rahatun Nesa Priti, Sabrina Afroz Mitu, Md. Iqramul Hoque, Shahriyar Zaman Ridoy, Mohammed Eunus Ali, Majd Hawasly, Mohammad Raza, Md Rizwan Parvez 3/2/2026

SpatiaLab: Can Vision-Language Models Perform Spatial Reasoning in the Wild?

SpatiaLab benchmark evaluates vision-language models' spatial reasoning capabilities on real-world tasks with visual noise and diverse spatial relationships.

Ax Tony Feng, Junehyuk Jung, Sang-hyun Kim, Carlo Pagano, Sergei Gukov, Chiang-Chiang Tsai, David Woodruff, Adel Javanmard, Aryan Mokhtari, Dawsen Hwang, Yuri Chervonyi, Jonathan N. Lee, Garrett Bingham, Trieu H. Trinh, Vahab Mirrokni, Quoc V. Le, Thang Luong 3/2/2026

Aletheia tackles FirstProof autonomously

Aletheia, a math research agent powered by Gemini 3 Deep Think, autonomously solved 6 of 10 FirstProof challenge problems.

Ax Sumin Kim, Jihoon Kwon, Yoon Kim, Nicole Kagan, Raffi Khatchadourian, Wonbin Ahn, Alejandro Lopez-Lira, Jaewon Lee, Yoontae Hwang, Oscar Levy, Yongjae Lee, Chanyeol Choi 3/2/2026

Forecasting Future Language: Context Design for Mention Markets

Research on context design for LLM-based probabilistic forecasting in mention prediction markets.

Ax Willem Schooltink, Fabio Massimo Zennaro 3/2/2026

Multi-Level Causal Embeddings

Framework for multi-level causal embeddings enabling mapping of detailed models into coarser causal model sub-systems.

Ax Zezhou Wang, Youjie Li, Zhiqi Lin, Jiacheng Yang, Cong Xie, Guanyu Feng, Zheng Zhong, Ziyue Huang, Hongyu Zhu, Zhi Zhang, Yanghua Peng, Xin Liu 3/2/2026

veScale-FSDP: Flexible and High-Performance FSDP at Scale

veScale-FSDP improves fully sharded data parallel training flexibility for structure-aware methods and advanced optimizers.

Ax Bangrui Xu, Qihang Yao, Zirui Tang, Xuanhe Zhou, Yeye He, Shihan Yu, Qianqian Xu, Bin Wang, Guoliang Li, Conghui He, Fan Wu 3/2/2026

MoDora: Tree-Based Semi-Structured Document Analysis System

MoDora is a tree-based system for analyzing semi-structured documents.

HN schmuhblaster 3/2/2026

Static taint analysis for LLM agents

DeepClause: Tool for compiling Markdown into DML (a Prolog-based language) to orchestrate LLM agents securely. Combines DSPy, CodeAct, and Prolog with static taint analysis.