Ax Christiaan Meijer, E. G. Patrick Bos 3/26/2026

Explainable embeddings with Distance Explainer

arXiv paper introducing Distance Explainer, a method for generating local, post-hoc explanations of embedding spaces. Adapts RISE-style saliency masking to attribute the distance between two embedded data points to input features.

Ax Divyat Mahajan, Sachin Goyal, Badr Youbi Idrissi, Mohammad Pezeshki, Ioannis Mitliagkas, David Lopez-Paz, Kartik Ahuja 3/26/2026

Beyond Multi-Token Prediction: Pretraining LLMs with Future Summaries

Proposes future summary prediction as alternative to next-token prediction during LLM pretraining, improving long-horizon reasoning and planning capabilities.

Ax Sebastián Andrés Cajas Ordóñez, Luis Fernando Torres Torres, Mackenzie J. Meni, Carlos Andrés Duran Paredes, Eric Arazo, Cristian Bosch, Ricardo Simon Carbajo, Yuan Lai, Leo Anthony Celi 3/26/2026

Uncertainty Makes It Stable: Curiosity-Driven Quantized Mixture-of-Experts

Curiosity-driven quantized Mixture-of-Experts framework using Bayesian uncertainty routing for accurate inference on resource-constrained devices.

Ax Natnael Mola, Leonardo S. B. Pereira, Carolina R. Kelsch, Luis H. Arribas, Juan C. S. M. Avedillo 3/26/2026

SPARE: Self-distillation for PARameter-Efficient Removal

Self-distillation approach for machine unlearning in text-to-image diffusion models. Balances effective forgetting with retention of unrelated concepts.

Ax Bjarni Haukur Bjarnason, André Silva, Martin Monperrus 3/26/2026

On Randomness in Agentic Evals

Statistical analysis of variance in agentic system evaluations. Shows that single-run pass@1 scores on SWE-Bench vary by 2.2 to 6.0 percentage points from run to run, and calls for improved evaluation methodology.
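
To illustrate why a single run can mislead, here is a minimal simulation sketch. It is not the paper's methodology: it assumes every task is an independent coin flip with the same success probability, and the function names, task count, run count, and probability are all hypothetical values chosen for illustration.

```python
import random
import statistics


def simulate_pass_at_1(n_tasks, p_solve, rng):
    """One evaluation run: fraction of tasks solved on the first attempt.

    Assumption: each task is solved independently with probability p_solve.
    """
    solved = sum(rng.random() < p_solve for _ in range(n_tasks))
    return solved / n_tasks


def run_spread(n_runs=10, n_tasks=500, p_solve=0.5, seed=0):
    """Repeat the same evaluation n_runs times and summarize the scores."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    scores = [simulate_pass_at_1(n_tasks, p_solve, rng) for _ in range(n_runs)]
    return {
        "min": min(scores),
        "max": max(scores),
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),
    }


if __name__ == "__main__":
    summary = run_spread()
    # The gap between min and max is what a single-run report hides.
    print(f"pass@1 range: {summary['min']:.3f} to {summary['max']:.3f}")
    print(f"mean {summary['mean']:.3f}, stdev {summary['stdev']:.3f}")
```

Reporting the mean and spread over several runs, rather than one score, is the kind of practice the summary's call for better methodology points toward.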

Ax Lei Ma, Jinyang Liu, Tieying Zhang, Peter M. VanNostrand, Dennis M. Hofmann, Lei Cao, Elke A. Rundensteiner, Jianjun Chen 3/26/2026

KRONE: Hierarchical and Modular Log Anomaly Detection

Hierarchical framework for log anomaly detection that preserves component execution structure. Addresses spurious correlations in flat-sequence approaches.

Ax Egor Denisov, Svetlana Glazyrina, Maksim Kryzhanovskiy, Roman Ischenko 3/26/2026

Smooth Gate Functions for Soft Advantage Policy Optimization

Smooth gate functions for stabilizing GRPO-based LLM training. Replaces hard clipping with sigmoid-based gating to improve optimization stability on reasoning tasks.
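
The contrast between hard clipping and a smooth gate can be sketched as follows. This is an illustrative assumption, not the paper's exact gate: the hard-clip function ignores the advantage sign for simplicity, and the soft gate uses one possible sigmoid form whose gradient weight is 4·σ(x)·(1 − σ(x)) with x = tau·(ratio − 1). The names `hard_clip_grad`, `soft_gate_grad`, `eps`, and `tau` are hypothetical.

```python
import math


def hard_clip_grad(ratio, eps=0.2):
    """Gradient weight under hard clipping (simplified).

    Inside the trust region [1 - eps, 1 + eps] the token contributes fully;
    outside it the gradient is cut to zero abruptly.
    """
    return 1.0 if (1.0 - eps) <= ratio <= (1.0 + eps) else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def soft_gate_grad(ratio, tau=10.0):
    """Gradient weight under an assumed sigmoid-based gate.

    Equals 1 at ratio = 1 and decays smoothly toward 0 as the policy ratio
    moves away from 1; a larger tau makes the decay sharper.
    """
    s = sigmoid(tau * (ratio - 1.0))
    return 4.0 * s * (1.0 - s)


if __name__ == "__main__":
    for r in (0.7, 0.9, 1.0, 1.1, 1.3):
        print(f"ratio={r:.1f}  hard={hard_clip_grad(r):.1f}  "
              f"soft={soft_gate_grad(r):.3f}")
```

Under this sketch, off-policy tokens are down-weighted gradually instead of being dropped at a fixed threshold, which is the intuition behind replacing the clip with a gate.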