Ax Hengjie Cao, Mengyi Chen, Yifeng Yang, Fang Dong, Ruijun Huang, Anrui Chen, Jixian Zhou, Mingzhi Dong, Yujiang Wang, Dongsheng Li, Wenyi Fang, Yuanyi Lin, Fan Wu, Li Shang 2/16/2026

Dispelling the Curse of Singularities in Neural Network Optimizations

Analysis of optimization instability in deep neural networks caused by parametric singularities, proposing gradient Frobenius norm solutions.

Ax Sunil Madhow, Yuchen Liang, Ness Shroff, Yingbin Liang, Yu-Xiang Wang 2/16/2026

Learnable Chernoff Baselines for Inference-Time Alignment

Learnable Chernoff Baselines enable efficient inference-time reward-guided alignment for generative models without architecture modifications or computational overhead.

Ax Yutao Zhu, Xingshuo Zhang, Maosen Zhang, Jiajie Jin, Liancheng Zhang, Xiaoshuai Song, Kangzhi Zhao, Wencong Zeng, Ruiming Tang, Han Li, Ji-Rong Wen, Zhicheng Dou 2/16/2026

GISA: A Benchmark for General Information-Seeking Assistant

GISA benchmark evaluates information-seeking AI agents performing multi-turn web interactions, addressing limitations of existing benchmarks with naturally constructed real-world tasks.

Ax Tiwei Bie, Maosong Cao, Xiang Cao, Bingsen Chen, Fuyuan Chen, Kun Chen, Lun Du, Daozhuo Feng, Haibo Feng, Mingliang Gong, Zhuocheng Gong, Yanmei Gu, Jian Guan, Kaiyuan Guan, Hongliang He, Zenan Huang, Juyong Jiang, Zhonghui Jiang, Zhenzhong Lan, Chengxi Li, Jianguo Li, Zehuan Li, Huabin Liu, Lin Liu, Guoshan Lu, Yuan Lu, Yuxin Ma, Xingyu Mou, Zhenxuan Pan, Kaida Qiu, Yuji Ren, Jianfeng Tan, Yiding Tian, Zian Wang, Lanning Wei, Tao Wu, Yipeng Xing, Wentao Ye, Liangyu Zha, Tianze Zhang, Xiaolu Zhang, Junbo Zhao, Da Zheng, Hao Zhong, Wanli Zhong, Jun Zhou, Junlin Zhou, Liwang Zhu, Muzhi Zhu, Yihong Zhuang 2/16/2026

LLaDA2.1: Speeding Up Text Diffusion via Token Editing

LLaDA2.1 improves text diffusion model decoding speed by combining Token-to-Token and Mask-to-Token editing schemes for 100B-parameter block-diffusion models.

Ax Scienta Team, Ethan Bandasack, Vincent Bouget, Apolline Bruley, Yannis Cattan, Charlotte Claye, Matthew Corney, Julien Duquesne, Karim El Kanbi, Aziz Fouché, Pierre Marschall, Francesco Strozzi 2/16/2026

EVA: Towards a universal model of the immune system

Foundation model for immune system combining multimodal patient-level data to capture multicellular interactions in immune diseases.

Ax Yuanyong Luo, Jing Huang, Yu Cheng, Ziwei Yu, Kaihua Tang, Xinda Ma, Xin Wang, Anping Tong, Guipeng Hu, Yun Xu, Mehran Taghian, Peng Wu, Guanglin Li, Yunke Peng, Tianchi Hu, Minqi Chen, Michael Bi Mi, Hu Liu, Xiping Zhou, Junsong Wang, Qiang Lin, Heng Liao 2/16/2026

HiFloat4 Format for Language Model Inference

Block floating-point data format for efficient LLM inference achieving 4.5 bits per value with hierarchical scaling.

Ax Hao Qin, Yukai Sun, Meng Wang, Ming Kong, Mengxu Lu, Qiang Zhu 2/16/2026

Variation-aware Flexible 3D Gaussian Editing

Method for editing 3D Gaussian Splatting representations with improved cross-view consistency and editing flexibility.

Ax Dianyi Wang, Ruihang Li, Feng Han, Chaofan Ma, Wei Song, Siyuan Wang, Yibin Wang, Yi Xin, Hongjian Liu, Zhixiong Zhang, Shengyuan Ding, Tianhang Wang, Zhenglin Cheng, Tao Lin, Cheng Jin, Kaicheng Yu, Jingjing Chen, Wenjie Wang, Zhongyu Wei, Jiaqi Wang 2/16/2026

DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing

Lightweight 5B-parameter multimodal model for image generation and editing, competitive with much larger models.

Ax Nate Rahn, Allison Qi, Avery Griffin, Jonathan Michala, Henry Sleight, Erik Jones 2/16/2026

Abstractive Red-Teaming of Language Model Character

Method for identifying queries that cause LLM character specification violations using red-teaming approaches to detect deployment-level failures efficiently.

Ax Gianfranco Cortés, Maria Esteban-Casadevall, Yueqing Feng, Jonas Henkel, Edward Hirst, Tancredi Schettini Gherardini, Alexander G. Stapleton 2/16/2026

A Machine Learning Approach to the Nirenberg Problem

Physics-informed neural network solving the Nirenberg differential geometry problem of prescribing Gaussian curvature on surfaces using a mesh-free approach.

Ax Raiz Ud Din (Artificial Intelligence Lab, Department of Computer Systems Engineering, University of Engineering and Applied Sciences), Saddam Hussain Khan (Interdisciplinary Research Center for Smart Mobility and Logistics, King Fahad University of Petroleum and Minerals) 2/16/2026

TFT-ACB-XML: Decision-Level Integration of Customized Temporal Fusion Transformer and Attention-BiLSTM with XGBoost Meta-Learner for BTC Price Forecasting

Hybrid machine learning ensemble framework combining temporal fusion transformer, attention-BiLSTM, and XGBoost for Bitcoin price forecasting.

Ax Maosen Tang, Alex Townsend 2/16/2026

Rational Neural Networks have Expressivity Advantages

Theoretical analysis showing neural networks with trainable low-degree rational activation functions are more expressive and parameter-efficient than networks with standard activations.

Ax Ari Spiesberger, Juan J. Vazquez, Nicky Pochinkov, Tomáš Gavenčiak, Peli Grietzer, Gavin Leech, Nandi Schoots 2/16/2026

Soft Contamination Means Benchmarks Test Shallow Generalization

Studies soft contamination in LLM training data through semantic duplicates, showing typical decontamination filters fail to detect near-equivalent benchmark test data.

Ax Paul Janson, Edouard Oyallon, Eugene Belilovsky 2/16/2026

Stabilizing Native Low-Rank LLM Pretraining

Demonstrates stable training of LLMs from scratch using exclusively low-rank weight factorization, matching dense model performance while reducing computational costs.

Ax Jinwoo Kim, Taylor Berg-Kirkpatrick, Loris D'Antoni 2/16/2026

Continuous Diffusion Models Can Obey Formal Syntax

Training-free guidance method enabling continuous diffusion language models to satisfy formal syntactic constraints like JSON schema matching via regular expressions.

Ax Noor Islam S. Mohammad, Md Muntaqim Meherab 2/16/2026

Regularized Meta-Learning for Improved Generalization

Regularized meta-learning framework addressing redundancy and overfitting in deep ensemble methods through redundancy-aware projection and statistical weighting.

Ax Milan Gautam, Ning Dai, Tianshuo Zhou, Bowen Xie, David Mathews, Liang Huang 2/16/2026

Designing RNAs with Language Models

RNA sequence design reframed as a conditional sequence generation task using language models instead of traditional optimization approaches.