Ax Marawan Gamal Abdel Hameed, Derek Tam, Pascal Jr Tikeng Notsawo, Colin Raffel, Guillaume Rabusseau 4/3/2026

Model Merging via Data-Free Covariance Estimation

Principled layer-wise optimization approach for model merging via data-free covariance estimation without task-specific training.

Ax Urs Hackstein, Jordi Alastruey, Philip Aston, Ciaran Bench, Peter H. Charlton, Loic Coquelin, Nando Hegemann, Vaidotas Marozas, Mohammad Moulaeifard, Manasi Nandi, Andrius Petrenas, Oskar Pfeffer, Mantas Rinkevicius, Andrius Solosenko, Nils Strodthoff, Sara Vardanega 4/3/2026

Benchmark Problems and Benchmark Datasets for the evaluation of Machine and Deep Learning methods on Photoplethysmography signals: the D4 report from the QUMPHY project

Benchmark datasets and evaluation protocols for machine learning methods on medical photoplethysmography (PPG) signals.

Ax Nicholas Roberts, Sungjun Cho, Zhiqi Gao, Tzu-Heng Huang, Albert Wu, Gabriel Orlanski, Avi Trost, Kelly Buchanan, Aws Albarghouthi, Frederic Sala 4/3/2026

Test-Time Scaling Makes Overtraining Compute-Optimal

Train-to-Test scaling laws that jointly optimize model size, training tokens, and inference samples for compute-optimal LLM deployment.

Ax Barak Gahtan, Alex M. Bronstein 4/3/2026

Coupled Query-Key Dynamics for Attention

Coupled query-key dynamics for scaled dot-product attention, improving language modeling perplexity by 6-7% on WikiText-103.

Ax Kang-Sin Choi 4/3/2026

Learn by Surprise, Commit by Proof

LSCP: Self-gated post-training framework for autonomous knowledge acquisition using self-generated Q&A chains and adaptive learning rates based on model conviction.