VC-Soup: Value-Consistency Guided Multi-Value Alignment for Large Language Models
Proposes VC-Soup method for aligning LLMs with multiple potentially conflicting human values through value-consistency guided optimization.
LLM-augmented computational phenotyping framework for discovering clinical subphenotypes in Long COVID through iterative hypothesis generation and evidence extraction.
Framework for detecting conflicts in policy languages that use probabilistic ML predicates, applied to semantic router DSL for LLM routing systems.
Improves PDE surrogate model training through gradient-informed temporal sampling strategies that optimize rollout accuracy under fixed data budgets.
Proposes AGRI-Fidelity framework to evaluate reliability of explainable AI for poultry disease detection in noisy farm environments.
Framework for evaluating reasoning-based LLMs on de novo molecular generation and drug discovery without requiring ground-truth molecule pairs.
Proposes Interventional Boundary Discovery to identify causal state dimensions agents can control, using Pearl's do-operator for causal identification.
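The summary above invokes Pearl's do-operator. As background (not the paper's Interventional Boundary Discovery procedure), a minimal sketch of what an intervention means in a toy structural causal model: setting `do(X=x)` severs X's dependence on its cause U, while Y still responds to X. All names here are illustrative.

```python
import random

def scm_sample(do_x=None, rng=random):
    """Toy SCM with chain U -> X -> Y.

    Observational sampling follows the structural equations; an intervention
    do(X=x) replaces X's equation with the constant x, cutting the U -> X edge.
    """
    u = rng.gauss(0.0, 1.0)          # exogenous noise
    x = u if do_x is None else do_x  # do(X=x) overrides the mechanism for X
    y = 2.0 * x + rng.gauss(0.0, 0.1)
    return x, y
```

Under `do(X=1)`, the interventional mean of Y is 2 regardless of U's distribution, which is exactly the kind of controllable-dimension signal such methods probe for.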
Addresses the squeezing effect in Direct Preference Optimization (DPO) for LLM alignment using sharpness-aware minimization in logit space.
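For context, the standard DPO objective that this line builds on can be written per preference pair as a logistic loss on a reference-adjusted log-probability margin. The sketch below shows that base loss only, not the paper's sharpness-aware variant; function and argument names are illustrative.

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    logp_w / logp_l: policy log-probs of the chosen / rejected response.
    ref_logp_w / ref_logp_l: same quantities under the frozen reference model.
    beta: inverse-temperature controlling the strength of the KL anchor.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log sigmoid(margin): loss shrinks as the chosen response is
    # preferred by a wider reference-adjusted margin.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At zero margin the loss is log 2; the squeezing effect the paper targets arises from how gradient updates of this loss redistribute probability mass in logit space.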
Studies alignment evaluation in LLMs by examining political censorship in Chinese language models, focusing on routing mechanisms beyond concept detection and refusal behaviors.
Additive Gaussian processes for wind farm power prediction from a population-based structural health monitoring perspective.
Path-constrained mixture-of-experts architecture that restricts expert routing paths to improve statistical efficiency and induce a more meaningful parameter structure.
ALIGN: adversarial learning framework for session-invariant speech neuroprosthesis decoding from brain-computer interfaces.
Neural graph representation learning with RL for approximate subgraph matching, an NP-hard problem in graph analysis.
Autocurriculum training methods with provable benefits for chain-of-thought reasoning in language models with reduced data/compute costs.
Vector-field reward shaping for offline RL to enable safe exploration near dataset boundaries using simulator confidence.
Epistemic GANs using Dempster-Shafer theory, together with architectural enhancements, to improve output diversity in generative models.
Comprehensive book on mathematical foundations of deep learning covering neural network approximation theory, optimal control, RL, and generative models.
RE-SAC: ensemble deep reinforcement learning for bus fleet control that disentangles aleatoric and epistemic uncertainty.
Flow matching approach for de novo molecular structure elucidation from mass spectra using deep generative models.
AFBS-BO framework for automated hyperparameter optimization of sparse attention mechanisms in transformers via adaptive fidelity Bayesian optimization.
Quantum multi-armed bandit and stochastic linear bandit algorithms robust to NISQ-device noise, achieving quadratic speedups over classical methods.
Sample-efficient reward estimation method for RL with verifiable rewards in large language model post-training.
Training suite for film shot language understanding using vision-language models to match expert cinematographic analysis.
Distributed asynchronous RL framework for Vision-Language-Action models with integrated trainable world models.
Calibration-free pruning method for Mixture-of-Experts language models to reduce memory and serving overhead.
Policy optimization approach addressing overthinking in large reasoning models through difficulty-differentiated training.
Study on synthetic data augmentation for efficient pre-training with better loss scaling using synthetic megadocs.
Research on active auditing framework against backdoor attacks in decentralized federated learning systems.
GAPSL: gradient-aligned parallel split learning for federated learning on heterogeneous data, reducing client computational load.
Transfers statistical methods from particle physics for UAV propeller fault detection using spectral features and neural inference.
SINDy-KANs combines Kolmogorov-Arnold networks with sparse identification to learn interpretable equations for nonlinear dynamical systems.
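The sparse-identification half of SINDy-KANs rests on sequentially thresholded least squares over a library of candidate terms. A minimal sketch of that base regression on a toy system (plain polynomial library, no KAN features, assumed noiseless data for clarity):

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares, the sparse regression at the
    core of SINDy: fit, zero out small coefficients, refit on the survivors."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        big = ~small
        if big.any():
            xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi

# Toy system dx/dt = -2x, with library [1, x, x^2].
x = np.linspace(-1.0, 1.0, 200)
dxdt = -2.0 * x
theta = np.column_stack([np.ones_like(x), x, x**2])
xi = stlsq(theta, dxdt)  # expect coefficient -2 on x, zeros elsewhere
```

The recovered coefficient vector reads off the governing equation directly; replacing the polynomial library with learned KAN basis functions is the paper's extension.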
Shows Transformers learn robust in-context regression under distributional uncertainty without restrictive assumptions on data and noise.
SpecForge: open-source training framework for speculative decoding draft models, reducing LLM inference latency through token batching.
Demonstrates adversarial attacks on GNNs exploitable through unlearning mechanisms designed for GDPR compliance in graph learning systems.
Systematic analysis of Elastic Weight Consolidation for continual learning, identifying issues with importance estimation and weight regularization methods.
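The EWC regularizer under analysis above is a quadratic penalty anchoring parameters to their values after a previous task, weighted per-parameter by a diagonal Fisher-information estimate. A minimal sketch of that penalty (names illustrative; the Fisher estimate itself is assumed given):

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher, lam=1.0):
    """EWC regularizer: (lam/2) * sum_i F_i * (theta_i - theta*_i)^2.

    theta: current parameters; theta_star: parameters after the old task;
    fisher: diagonal Fisher estimate of each parameter's importance.
    """
    return 0.5 * lam * np.sum(fisher * (theta - theta_star) ** 2)
```

The importance-estimation issues the study identifies concern how well that diagonal Fisher term actually reflects which parameters the old task depends on.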
Evaluates model-free policy optimization algorithms using exact blackjack oracle with ground-truth benchmarks for discrete stochastic control.
Investigates multi-corpus training in speech spoofing detection using self-supervised learning, finding domain-specific biases harm generalization.
Studies label inference attacks in vertical federated learning, analyzing vulnerabilities when passive parties infer active party's labels and proposing defenses.
HISR proposes segmental process rewards for multi-turn RL in LLM agents, addressing sparse reward propagation and credit assignment in long-horizon decision-making tasks.
Investigates transfer learning from audio and time-series foundation models to scientific time-series via cross-domain distillation.
Proposes OCP method for improving item embeddings in large-scale commodity recommendation systems.
Studies off-policy learning in contextual bandits with supply constraints for recommendation and advertising systems.
Causal-theoretic approach for reward modeling using observational user feedback instead of expensive annotated data for RLHF alignment.
Ablation study examining necessity of components in Group Relative Policy Optimization for teaching LLMs reasoning and mathematical ability.
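One component such an ablation would isolate is GRPO's group-relative advantage: rewards for a group of sampled completions are standardized within the group, replacing a learned critic. A minimal sketch of that baseline computation (not the full GRPO update):

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize each completion's reward within
    its sampled group, so the group mean serves as the baseline."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)
```

Ablations typically toggle pieces like the std normalization or the KL term to test which ones the reasoning gains actually depend on.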
Deep VAE-GAN approach improving reservoir parameterization for data assimilation in petroleum reservoir simulation.
AutoPipe framework for automatically configuring LLM post-training pipelines combining supervised fine-tuning and reinforcement learning under budget constraints.
Study on using discriminators to enhance generative model training across GANs, weak learner frameworks, and diffusion models.
Method mitigating asynchronous data drift in federated learning where different devices experience different distribution shifts.
Neuroscience framework introducing authority-level priors to hierarchical predictive processing for understanding autonomic regulation.
Theoretical error analysis of the Adam optimizer for deep neural network training and beyond, addressing open gaps in its convergence theory.