Isolater - Feed

Ax Harshwardhan Fartale, Ashish Kattamuri, Rahul Raja, Arpita Vats, Ishita Prasad, Akshata Kishore Moharir 3/16/2026

Disentangling Recall and Reasoning in Transformer Models through Layer-wise Attention and Activation Analysis

Analysis of transformer internals distinguishing recall from reasoning mechanisms through layer-wise attention and activation patterns for interpretability.

Ax Giorgos Nikolaou, Tommaso Mencattini, Donato Crisostomi, Andrea Santilli, Yannis Panagakis, Emanuele Rodolà 3/16/2026

Language Models are Injective and Hence Invertible

Mathematical proof that transformer language models are injective, enabling exact input recovery from representations despite nonlinear components.

Ax Rikard Vinge, Isabelle Wittmann, Jannik Schneider, Michael Marszalek, Luis Gilch, Thomas Brunschwiler, Conrad M Albrecht 3/16/2026

NeuCo-Bench: A Novel Benchmark Framework for Neural Embeddings in Earth Observation

Benchmark framework for evaluating neural compression and representation learning on Earth observation satellite imagery tasks.

Ax Kemou Li, Qizhou Wang, Yue Wang, Fengpeng Li, Jun Liu, Bo Han, Jiantao Zhou 3/16/2026

LLM Unlearning with LLM Beliefs

Method for unlearning harmful content from LLMs by analyzing belief redistribution in probability space, avoiding unwanted side effects of gradient ascent.

Ax Tingkai Yan, Haodong Wen, Binghui Li, Kairong Luo, Wenguang Chen, Kaifeng Lyu 3/16/2026

Larger Datasets Can Be Repeated More: A Theoretical Analysis of Multi-Epoch Scaling in Linear Regression

Theoretical analysis of data scaling laws in linear regression when training multiple epochs on limited datasets, relevant to LLM training efficiency.

Ax Pramudita Satria Palar, Paul Saves, Rommel G. Regis, Koji Shimoyama, Shigeru Obayashi, Nicolas Verstaevel, Joseph Morlier 3/16/2026

Global Sensitivity Analysis for Engineering Design Based on Individual Conditional Expectations

Global sensitivity analysis technique for engineering design using individual conditional expectations to improve interpretability of black-box models.

Ax Sonal Prabhune, Balaji Padmanabhan, Kaushik Dutta 3/16/2026

Information-Consistent Language Model Recommendations through Group Relative Policy Optimization

Method to improve LLM consistency and reliability across semantically equivalent prompts using group relative policy optimization for business-critical applications.

Ax Damian Hodel, Jevin D. West 3/16/2026

Epistemic diversity across language models mitigates knowledge collapse

Study demonstrating that ensemble diversity across language models mitigates knowledge collapse from training on model-generated outputs.

Ax Taeyun Kim 3/16/2026

Structural Incompatibility of Differentiable Sorting and Within-Vector Rank Normalization

Theoretical analysis proving structural incompatibility between differentiable sorting operators and rank normalization techniques.

Ax Jiawen Chen, Qi Shao, Mingtong Zhou, Duxin Chen, Wenwu Yu 3/16/2026

CCMamba: Topologically-Informed Selective State-Space Networks on Combinatorial Complexes for Higher-Order Graph Learning

Selective state-space networks on combinatorial complexes for higher-order graph learning using topological deep learning.

Ax Evandro S. Ortigossa, Guy Lutsker, Eran Segal 3/16/2026

MoHETS: Long-term Time Series Forecasting with Mixture-of-Heterogeneous-Experts

Mixture-of-experts approach with heterogeneous experts for capturing multi-scale temporal dynamics in long-horizon time series forecasting.

Ax Ali Forootani, Raffaele Iervolino 3/16/2026

Learnable Koopman-Enhanced Transformer-Based Time Series Forecasting with Spectral Control

Integration of Koopman operator theory with transformer architectures for time series forecasting with learnable spectral parameterizations.

Ax Kevin Zhai, Sabbir Mollah, Zhenyi Wang, Mubarak Shah 3/16/2026

CORE: Context-Robust Remasking for Diffusion Language Models

Decoding strategy for masked diffusion language models that dynamically adjusts token retention based on context coverage.

Ax Nghia Nguyen, Tianjiao Ding, René Vidal 3/16/2026

Hierarchical Concept Embedding & Pursuit for Interpretable Image Classification

Interpretable image classification using hierarchical concept embeddings recovered from vision-language model latent spaces.

Ax Zhen Bi, Xueshu Chen, Luoyang Sun, Yuhang Yao, Qing Shen, Jungang Lou, Cheng Deng 3/16/2026

RooflineBench: A Benchmarking Framework for On-Device LLMs via Roofline Analysis

Benchmarking framework using roofline analysis to characterize performance of small language models on resource-constrained edge hardware.

Ax Chethana Prasad Kabgere, Shylaja SS 3/16/2026

On the Geometric Coherence of Global Aggregation in Federated Graph Neural Networks

Federated learning approach addressing heterogeneous graph structures in distributed GNN training across multiple clients.

Ax Shubhangi Upasani, Chen Wu, Jay Rainton, Bo Li, Urmish Thakker, Changran Hu, Qizheng Zhang 3/16/2026

Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls

Study of many-shot in-context learning as test-time adaptation for LLMs, analyzing benefits and reliability limits with open-source models.

Ax Jonas Landsgesell, Pascal Knoll 3/16/2026

Distributional Regression with Tabular Foundation Models: Evaluating Probabilistic Predictions via Proper Scoring Rules

Evaluation framework using proper scoring rules for assessing distributional predictions from tabular foundation models beyond point estimates.

Ax Hiroki Naganuma, Atish Agarwala, Priya Kasimbeg, George E. Dahl 3/16/2026

What do near-optimal learning rate schedules look like?

Search procedure to identify optimal learning rate schedule shapes for neural network training across different workloads.

Ax Amit Singh, Vedant Nipane, Pulkit Agrawal, Jatin Kishnani, Sairanjan Mishra 3/16/2026

H2LooP Spark Preview: Continual Pretraining of Large Language Models for Low-Level Embedded Systems Code

Continual pretraining of LLMs specialized for low-level embedded systems code generation, targeting underrepresented hardware domains.

Ax Michael I. Jordan, Yixin Wang, Angela Zhou 3/16/2026

Data-Driven Influence Functions for Optimization-Based Causal Inference

Algorithm for approximating Gateaux derivatives in causal inference when distributions must be estimated from data.

Ax Huiming Zhang, Haoyu Wei, Guang Cheng 3/16/2026

Tight Non-asymptotic Inference via Sub-Gaussian Intrinsic Moment Norm

Statistical methods for estimating sub-Gaussian distribution parameters using intrinsic moment norms in non-asymptotic learning.

Ax Yichuan Deng, Zhao Song, Kaijun Yuan, Tianyi Zhou 3/16/2026

Why Softmax Attention Outperforms Linear Attention

Comparative analysis of softmax vs linear attention mechanisms in transformer architectures, examining computational efficiency tradeoffs.

Ax Hemanth Saratchandran, Sameera Ramasinghe, Simon Lucey 3/16/2026

From Activation to Initialization: Scaling Insights for Optimizing Neural Fields

Theoretical framework studying initialization and activation function scaling in neural fields for computer vision signal representation.

Ax Guido Di Federico, Louis J. Durlofsky 3/16/2026

Latent diffusion models for parameterization and data assimilation of facies-based geomodels

Latent diffusion models for geological parameterization and data assimilation, generating realistic geomodels with reduced variables for history matching.

Ax José A. Carrillo, Yifan Chen, Daniel Zhengyu Huang, Jiaoyang Huang, Dongyi Wei 3/16/2026

Fisher-Rao Gradient Flow: Geodesic Convexity and Functional Inequalities

Theoretical analysis of Fisher-Rao gradient flow dynamics, establishing geodesic convexity and functional inequalities.

Ax Fangyi Wei, Jiajie Mo, Kai Zhang, Haipeng Shen, Srikantan Nagarajan, Fei Jiang 3/16/2026

Nested Deep Learning Model Towards A Foundation Model for Brain Signal Data

Nested deep learning foundation model for EEG/MEG spike detection in epilepsy diagnosis, addressing manual identification limitations.

Ax Jianwei Li, Jung-Eun Kim 3/16/2026

Superficial Safety Alignment Hypothesis

Analysis of the brittleness of LLM safety alignment, proposing the Superficial Safety Alignment Hypothesis to explain why standard alignment approaches are vulnerable.

Ax Pablo de los Riscos, Fernando J. Corbacho 3/16/2026

Active Causal Structure Learning with Latent Variables: Towards Learning to Detour in Autonomous Robots

Active causal structure learning framework enabling autonomous robots and AGI agents to dynamically construct causal models of environmental interactions.

Ax Xialie Zhuang, Zhikai Jia, Jianjin Li, Zhenyu Zhang, Li Shen, Zheng Cao, Shiwei Liu 3/16/2026

Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More

Training paradigm integrating masked language modeling with next-token prediction to improve in-context retrieval in large language models.

Ax Deyu Bo, Songhua Liu, Xinchao Wang 3/16/2026

Understanding Dataset Distillation via Spectral Filtering

Spectral filtering framework unifying dataset distillation methods by interpreting them as filters affecting feature correlation eigenvalues.

Ax Daoze Zhang, Zhijian Bao, Sihang Du, Zhiyi Zhao, Kuangling Zhang, Dezheng Bao, Yang Yang 3/16/2026

Re2: A Consistency-ensured Dataset for Full-stage Peer Review and Multi-turn Rebuttal Discussions

Dataset of peer review discussions and rebuttals to support automated manuscript evaluation and improve scientific publishing workflow efficiency.

Ax Jonathan García, Philipp Petersen 3/16/2026

Minimax learning rates for estimating binary classifiers under margin conditions

Theoretical analysis of minimax learning rates for binary classification under geometric margin conditions with horizon function decision boundaries.

Ax Vinod Raman, Hilal Asi, Satyen Kale 3/16/2026

AdaBoN: Adaptive Best-of-N Alignment

Prompt-adaptive Best-of-N alignment strategy using reward models to reduce computational cost of test-time alignment for language models.

Ax Thai-Hoc Vu, Ngo Hoang Tu, Thien Huynh-The, Kyungchun Lee, Sunghwan Kim, Miroslav Voznak, Quoc-Viet Pham 3/16/2026

Integration of TinyML and LargeML: A Survey of 6G and Beyond

Survey on integrating TinyML and LargeML for 6G networks, covering deep learning applications in mobile systems, autonomous vehicles, and smart services.

Ax Konstantin Dobler, Desmond Elliott, Gerard de Melo 3/16/2026

Token Distillation: Attention-aware Input Embeddings For New Tokens

Attention-aware embedding initialization method for new tokens in LLMs without expensive retraining, addressing vocabulary limitations in specialized domains.

Ax Tobias J. Riedlinger, Kira Maag, Hanno Gottschalk 3/16/2026

Towards Reliable Detection of Empty Space: Conditional Marked Point Processes for Object Detection

Conditional marked point processes for reliable object detection uncertainty quantification, addressing miscalibrated confidence scores in neural networks.

Ax Amirabbas Hojjati, Lu Li, Ibrahim Hameed, Anis Yazidi, Pedro G. Lind, Rabindra Khadka 3/16/2026

From Video to EEG: Adapting Joint Embedding Predictive Architecture to Uncover Spatiotemporal Dynamics in Brain Signal Analysis

Self-supervised learning approach adapting joint embedding architecture from video to EEG signals for brain activity analysis with limited labeled data.

Ax Maida Wang, Xiao Xue, Mingyang Gao, Peter V. Coveney 3/16/2026