Ax Yuxi Liu, Konpat Preechakul, Kananart Kuwaranancharoen, Yutong Bai 2/17/2026

The Serial Scaling Hypothesis

Theoretical framework identifying inherently sequential problems, which cannot be efficiently parallelized; relevant to LLM reasoning.

Ax Kaiwen Zheng, Huayu Chen, Haotian Ye, Haoxiang Wang, Qinsheng Zhang, Kai Jiang, Hang Su, Stefano Ermon, Jun Zhu, Ming-Yu Liu 2/17/2026

DiffusionNFT: Online Diffusion Reinforcement with Forward Process

Online reinforcement learning method for diffusion models addressing intractable likelihoods, enabling RLHF-style training without solver restrictions.

Ax Anton Korznikov, Andrey Galichin, Alexey Dontsov, Oleg Y. Rogov, Ivan Oseledets, Elena Tutubalina 2/17/2026

The Rogue Scalpel: Activation Steering Compromises LLM Safety

Demonstrates that activation steering for LLM control can compromise safety mechanisms, causing models to comply with harmful requests.

Ax Aman Gupta, Rafael Celente, Abhishek Shivanna, D. T. Braithwaite, Gregory Dexter, Shao Tang, Hiroto Udagawa, Daniel Silva, Rohan Ramanath, S. Sathiya Keerthi 2/17/2026

Effective Quantization of Muon Optimizer States

8-bit quantization technique for Muon optimizer states in LLM pre-training, reducing memory overhead while maintaining training efficiency.

Ax Zhaomin Wu, Haodong Zhao, Ziyang Wang, Jizhou Guo, Qian Wang, Bingsheng He 2/17/2026

LLM DNA: Tracing Model Evolution via Functional Representations

Method to trace evolutionary relationships between LLMs through functional representations, enabling better model management and understanding of fine-tuning/distillation lineages.

Ax Patrick Langer, Thomas Kaar, Max Rosenblattl, Maxwell A. Xu, Winnie Chow, Martin Maritsch, Robert Jakob, Ning Wang, Juncheng Liu, Aradhana Verma, Brian Han, Daniel Seung Kim, Henry Chubb, Scott Ceresnak, Aydin Zahedivash, Alexander Tarlochan Singh Sandhu, Fatima Rodriguez, Daniel McDuff, Elgar Fleisch, Oliver Aalami, Filipe Barata, Paul Schmiedmayer 2/17/2026

OpenTSLM: Time-Series Language Models for Reasoning over Multivariate Medical Text- and Time-Series Data

Time-series language models that integrate multivariate medical time series as a native modality, enabling LLMs to reason over temporal clinical data.

Ax Yukun Zhang, Xueqing Zhou 2/17/2026

Where to Add PDE Diffusion in Transformers

Research on the optimal placement of PDE diffusion layers, which apply heat-equation-based smoothing for local geometric priors, in transformer architectures.