Isolater - Feed

Ax Ruta Binkyte, Ivaxi Sheth, Zhijing Jin, Mohammad Havaei, Bernhard Schölkopf, Mario Fritz 3/16/2026

Causality Is Key to Understand and Balance Multiple Goals in Trustworthy ML and Foundation Models

Advocates integrating causal methods into ML to balance trustworthiness objectives like fairness, privacy, robustness, and explainability.

Ax Heng-Sheng Chang, Prashant G. Mehta 3/16/2026

Dual Filter: A Transformer-like Inference Architecture for Hidden Markov Models

Proposes Dual Filter framework connecting Hidden Markov Models to transformer decoder architecture for causal nonlinear prediction.

Ax Mariana A. Fazio, Manel Martinez-Ramon, Salvador Sosa Güitron, Marcus Babzien, Mikhail Fedurin, Junjie Li, Mark Palmer, Sandra S. Biedron 3/16/2026

Unsupervised anomaly detection in MeV ultrafast electron diffraction

Applies unsupervised anomaly detection to ultrafast electron diffraction data to identify beam instabilities in materials science experiments.

Ax Yueheng Li, Guangming Xie, Zongqing Lu 3/16/2026

Guided Policy Optimization under Partial Observability

Introduces Guided Policy Optimization framework for RL in partially observable environments using privileged information from simulators.

Ax Nicolas Keriven 3/16/2026

Backward Oversmoothing: why is it hard to train deep Graph Neural Networks?

Analyzes oversmoothing problem in deep Graph Neural Networks and explores why networks fail to learn non-oversmoothed representations.

Ax Lakshita Dodeja, Karl Schmeckpeper, Shivam Vats, Thomas Weng, Mingxi Jia, George Konidaris, Stefanie Tellex 3/16/2026

Accelerating Residual Reinforcement Learning with Uncertainty Estimation

Proposes uncertainty estimation improvements to Residual Reinforcement Learning for faster adaptation of pretrained policies with sparse rewards.

Ax Rui Huang, Shitong Shao, Zikai Zhou, Pukun Zhao, Hangyu Guo, Tian Ye, Lichen Bai, Shuo Yang, Zeke Xie 3/16/2026

Accelerating Diffusion Model Training under Minimal Budgets: A Condensation-Based Perspective

Data condensation approach for training diffusion models with minimal computational budget by constructing smaller synthetic training datasets.

Ax Tianyin Liao, Ziwei Zhang, Yufei Sun, Chunyu Hu, Jianxin Li 3/16/2026

Invariant Graph Transformer for Out-of-Distribution Generalization

Graph transformer architecture designed for invariant learning to improve out-of-distribution generalization on graph-structured data.

Ax Sibylle Marcotte, Gabriel Peyré, Rémi Gribonval 3/16/2026

Intrinsic training dynamics of deep neural networks

Theoretical study of implicit bias in deep neural network training showing gradient flow induces learning of lower-dimensional parameter structures.

Ax Gyutae Oh, Jitae Shin 3/16/2026

UniPrompt-CL: Sustainable Continual Learning in Medical AI with Unified Prompt Pools

Continual learning framework with unified prompt pools for medical imaging tasks, addressing domain-specific challenges in adaptive AI.

Ax Arwen Bradley 3/16/2026

Local Mechanisms of Compositional Generalization in Conditional Diffusion

Analysis of compositional generalization mechanisms in conditional diffusion models, studying length generalization on controlled image generation tasks.

Ax Prabhat Karmakar, Sayan Gupta, Ilaksh Adlakha 3/16/2026

Extended Low-Rank Approximation Accelerates Learning of Elastic Response in Heterogeneous Materials

Low-rank approximation technique for accelerating machine learning models predicting mechanical properties of heterogeneous materials.

Ax Abhishek Moturu, Muhammad Muzammil, Anna Goldenberg, Babak Taati 3/16/2026

LiLAW: Lightweight Learnable Adaptive Weighting to Meta-Learn Sample Difficulty, Improve Noisy Training, Increase Fairness, and Effectively Use Synthetic Data

Lightweight meta-learning method using three parameters to dynamically adjust sample loss weights for noisy training, fairness, and synthetic data utilization.

Ax Krishu K Thapa, Reet Barik, Krishna Teja Chitty-Venkata, Murali Emani, Venkatram Vishwanath 3/16/2026

PreLoRA: Hybrid Pre-training of Vision Transformers with Full Training and Low-Rank Adapters

Hybrid pre-training approach using low-rank adapters alongside full training to reduce computational cost for vision transformer training.

Ax Xvyuan Liu, Xiangfei Qiu, Hanyin Cheng, Xingjian Wu, Chenjuan Guo, Bin Yang, Jilin Hu 3/16/2026

ASTGI: Adaptive Spatio-Temporal Graph Interactions for Irregular Multivariate Time Series Forecasting

Graph-based method for forecasting irregular multivariate time series in healthcare and finance with adaptive spatio-temporal interactions.

Ax Jonas Ngnawé, Maxime Heuillet, Sabyasachi Sahoo, Yann Pequignot, Ola Ahmad, Audrey Durand, Frédéric Precioso, Christian Gagné 3/16/2026

Robust Fine-Tuning from Non-Robust Pretrained Models: Mitigating Suboptimal Transfer With Epsilon-Scheduling

Method for robustly fine-tuning non-robust pretrained models, using epsilon-scheduling to achieve adversarial robustness and task adaptation simultaneously.

Ax Jiayi Li, Flora D. Salim 3/16/2026

DRIFT-Net: A Spectral-Coupled Neural Operator for PDEs Learning

Neural operator architecture combining spectral and coupling methods for efficiently learning partial differential equation dynamics.

Ax Harshwardhan Fartale, Ashish Kattamuri, Rahul Raja, Arpita Vats, Ishita Prasad, Akshata Kishore Moharir 3/16/2026

Disentangling Recall and Reasoning in Transformer Models through Layer-wise Attention and Activation Analysis

Analysis of transformer internals distinguishing recall from reasoning mechanisms through layer-wise attention and activation patterns for interpretability.

Ax Giorgos Nikolaou, Tommaso Mencattini, Donato Crisostomi, Andrea Santilli, Yannis Panagakis, Emanuele Rodolà 3/16/2026

Language Models are Injective and Hence Invertible

Mathematical proof that transformer language models are injective, enabling exact input recovery from representations despite nonlinear components.

Ax Rikard Vinge, Isabelle Wittmann, Jannik Schneider, Michael Marszalek, Luis Gilch, Thomas Brunschwiler, Conrad M Albrecht 3/16/2026

NeuCo-Bench: A Novel Benchmark Framework for Neural Embeddings in Earth Observation

Benchmark framework for evaluating neural compression and representation learning on earth observation satellite imagery tasks.

Ax Kemou Li, Qizhou Wang, Yue Wang, Fengpeng Li, Jun Liu, Bo Han, Jiantao Zhou 3/16/2026

LLM Unlearning with LLM Beliefs

Method for unlearning harmful content from LLMs by analyzing belief redistribution in probability space, avoiding unwanted side effects of gradient ascent.

Ax Tingkai Yan, Haodong Wen, Binghui Li, Kairong Luo, Wenguang Chen, Kaifeng Lyu 3/16/2026

Larger Datasets Can Be Repeated More: A Theoretical Analysis of Multi-Epoch Scaling in Linear Regression

Theoretical analysis of data scaling laws in linear regression when training multiple epochs on limited datasets, relevant to LLM training efficiency.

Ax Pramudita Satria Palar, Paul Saves, Rommel G. Regis, Koji Shimoyama, Shigeru Obayashi, Nicolas Verstaevel, Joseph Morlier 3/16/2026

Global Sensitivity Analysis for Engineering Design Based on Individual Conditional Expectations

Global sensitivity analysis technique for engineering design using individual conditional expectations to improve interpretability of black-box models.

Ax Sonal Prabhune, Balaji Padmanabhan, Kaushik Dutta 3/16/2026

Information-Consistent Language Model Recommendations through Group Relative Policy Optimization

Method to improve LLM consistency and reliability across semantically equivalent prompts using group relative policy optimization for business-critical applications.

Ax Damian Hodel, Jevin D. West 3/16/2026

Epistemic diversity across language models mitigates knowledge collapse

Study demonstrating that ensemble diversity across language models mitigates knowledge collapse from training on model-generated outputs.

Ax Taeyun Kim 3/16/2026

Structural Incompatibility of Differentiable Sorting and Within-Vector Rank Normalization

Theoretical analysis proving structural incompatibility between differentiable sorting operators and rank normalization techniques.

Ax Jiawen Chen, Qi Shao, Mingtong Zhou, Duxin Chen, Wenwu Yu 3/16/2026

CCMamba: Topologically-Informed Selective State-Space Networks on Combinatorial Complexes for Higher-Order Graph Learning

Selective state-space networks on combinatorial complexes for higher-order graph learning using topological deep learning.

Ax Evandro S. Ortigossa, Guy Lutsker, Eran Segal 3/16/2026

MoHETS: Long-term Time Series Forecasting with Mixture-of-Heterogeneous-Experts

Mixture-of-experts approach with heterogeneous experts for capturing multi-scale temporal dynamics in long-horizon time series forecasting.

Ax Ali Forootani, Raffaele Iervolino 3/16/2026

Learnable Koopman-Enhanced Transformer-Based Time Series Forecasting with Spectral Control

Integration of Koopman operator theory with transformer architectures for time series forecasting with learnable spectral parameterizations.

Ax Kevin Zhai, Sabbir Mollah, Zhenyi Wang, Mubarak Shah 3/16/2026

CORE: Context-Robust Remasking for Diffusion Language Models

Decoding strategy for masked diffusion language models that dynamically adjusts token retention based on context coverage.

Ax Nghia Nguyen, Tianjiao Ding, René Vidal 3/16/2026

Hierarchical Concept Embedding & Pursuit for Interpretable Image Classification

Interpretable image classification using hierarchical concept embeddings recovered from vision-language model latent spaces.

Ax Zhen Bi, Xueshu Chen, Luoyang Sun, Yuhang Yao, Qing Shen, Jungang Lou, Cheng Deng 3/16/2026

RooflineBench: A Benchmarking Framework for On-Device LLMs via Roofline Analysis

Benchmarking framework using roofline analysis to characterize performance of small language models on resource-constrained edge hardware.

Ax Chethana Prasad Kabgere, Shylaja SS 3/16/2026

On the Geometric Coherence of Global Aggregation in Federated Graph Neural Networks

Federated learning approach addressing heterogeneous graph structures in distributed GNN training across multiple clients.

Ax Shubhangi Upasani, Chen Wu, Jay Rainton, Bo Li, Urmish Thakker, Changran Hu, Qizheng Zhang 3/16/2026

Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls

Study of many-shot in-context learning as test-time adaptation for LLMs, analyzing benefits and reliability limits with open-source models.

Ax Jonas Landsgesell, Pascal Knoll 3/16/2026

Distributional Regression with Tabular Foundation Models: Evaluating Probabilistic Predictions via Proper Scoring Rules

Evaluation framework using proper scoring rules for assessing distributional predictions from tabular foundation models beyond point estimates.

Ax Hiroki Naganuma, Atish Agarwala, Priya Kasimbeg, George E. Dahl 3/16/2026

What do near-optimal learning rate schedules look like?

Search procedure to identify optimal learning rate schedule shapes for neural network training across different workloads.

Ax Amit Singh, Vedant Nipane, Pulkit Agrawal, Jatin Kishnani, Sairanjan Mishra 3/16/2026

H2LooP Spark Preview: Continual Pretraining of Large Language Models for Low-Level Embedded Systems Code

Continual pretraining of LLMs specialized for low-level embedded systems code generation, targeting underrepresented hardware domains.

Ax Michael I. Jordan, Yixin Wang, Angela Zhou 3/16/2026

Data-Driven Influence Functions for Optimization-Based Causal Inference

Algorithm for approximating Gateaux derivatives in causal inference when distributions must be estimated from data.

Ax Huiming Zhang, Haoyu Wei, Guang Cheng 3/16/2026

Tight Non-asymptotic Inference via Sub-Gaussian Intrinsic Moment Norm

Statistical methods for estimating sub-Gaussian distribution parameters using intrinsic moment norms in non-asymptotic learning.

Ax Yichuan Deng, Zhao Song, Kaijun Yuan, Tianyi Zhou 3/16/2026

Why Softmax Attention Outperforms Linear Attention

Comparative analysis of softmax vs linear attention mechanisms in transformer architectures, examining computational efficiency tradeoffs.

Ax Hemanth Saratchandran, Sameera Ramasinghe, Simon Lucey 3/16/2026

From Activation to Initialization: Scaling Insights for Optimizing Neural Fields

Theoretical framework studying initialization and activation function scaling in neural fields for computer vision signal representation.

Ax Guido Di Federico, Louis J. Durlofsky 3/16/2026

Latent diffusion models for parameterization and data assimilation of facies-based geomodels

Latent diffusion models for geological parameterization and data assimilation, generating realistic geomodels with reduced variables for history matching.

Ax José A. Carrillo, Yifan Chen, Daniel Zhengyu Huang, Jiaoyang Huang, Dongyi Wei 3/16/2026

Fisher-Rao Gradient Flow: Geodesic Convexity and Functional Inequalities

Theoretical analysis of gradient flow dynamics under the Fisher-Rao metric, establishing geodesic convexity and functional inequalities.

Ax Fangyi Wei, Jiajie Mo, Kai Zhang, Haipeng Shen, Srikantan Nagarajan, Fei Jiang 3/16/2026

Nested Deep Learning Model Towards A Foundation Model for Brain Signal Data

Nested deep learning foundation model for EEG/MEG spike detection in epilepsy diagnosis, addressing manual identification limitations.

Ax Jianwei Li, Jung-Eun Kim 3/16/2026

Superficial Safety Alignment Hypothesis

Analyzes brittleness of LLM safety alignment mechanisms, proposing superficial safety alignment hypothesis explaining why standard alignment approaches are vulnerable.

Ax Pablo de los Riscos, Fernando J. Corbacho 3/16/2026

Active Causal Structure Learning with Latent Variables: Towards Learning to Detour in Autonomous Robots

Active causal structure learning framework enabling autonomous robots and AGI agents to dynamically construct causal models of environmental interactions.

Ax Xialie Zhuang, Zhikai Jia, Jianjin Li, Zhenyu Zhang, Li Shen, Zheng Cao, Shiwei Liu 3/16/2026

Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More

Training paradigm integrating masked language modeling with next-token prediction to improve in-context retrieval in large language models.

Ax Deyu Bo, Songhua Liu, Xinchao Wang 3/16/2026

Understanding Dataset Distillation via Spectral Filtering

Spectral filtering framework unifying dataset distillation methods by interpreting them as filters affecting feature correlation eigenvalues.

Ax Daoze Zhang, Zhijian Bao, Sihang Du, Zhiyi Zhao, Kuangling Zhang, Dezheng Bao, Yang Yang 3/16/2026

Re2: A Consistency-ensured Dataset for Full-stage Peer Review and Multi-turn Rebuttal Discussions

Dataset of peer review discussions and rebuttals to support automated manuscript evaluation and improve scientific publishing workflow efficiency.

Ax Jonathan García, Philipp Petersen 3/16/2026

Minimax learning rates for estimating binary classifiers under margin conditions

Theoretical analysis of minimax learning rates for binary classification under geometric margin conditions with horizon function decision boundaries.