Isolater - Feed

Ax Iván Arcuschin, David Chanin, Adrià Garriga-Alonso, Oana-Maria Camburu 2/20/2026

Biases in the Blind Spot: Detecting What LLMs Fail to Mention

Automated black-box pipeline detects unverbalized biases in LLM reasoning where models hide internal biases in plausible-sounding chain-of-thought explanations.

Ax Ivan Vulić, Adam Grycner, Quentin de Laroussilhe, Jonas Pfeiffer 2/20/2026

LoRA-Squeeze: Simple and Effective Post-Tuning and In-Tuning Compression of LoRA Modules

LoRA-Squeeze compresses LoRA modules through post-tuning and in-tuning methods to simplify rank selection and improve deployment efficiency for fine-tuning LLMs.

Ax Yejin Kim, Wilbert Pumacay, Omar Rayyan, Max Argus, Winson Han, Eli VanderBilt, Jordi Salvador, Abhay Deshpande, Rose Hendrix, Snehal Jauhri, Shuo Liu, Nur Muhammad Mahi Shafiullah, Maya Guru, Ainaz Eftekhar, Karen Farley, Donovan Clay, Jiafei Duan, Arjun Guru, Piper Wolters, Alvaro Herrasti, Ying-Chun Lee, Georgia Chalvatzaki, Yuchen Cui, Ali Farhadi, Dieter Fox, Ranjay Krishna 2/20/2026

MolmoSpaces: A Large-Scale Open Ecosystem for Robot Navigation and Manipulation

MolmoSpaces is an open ecosystem for robot navigation and manipulation with diverse benchmarks for evaluating generalization in real-world robotic tasks.

Ax Sher Badshah, Ali Emami, Hassan Sajjad 2/20/2026

SCOPE: Selective Conformal Optimized Pairwise LLM Judging

SCOPE framework for calibrated pairwise LLM judging with statistical guarantees, reducing miscalibration and systematic biases in evaluations.

Ax Ziyi Li, Hui Ma, Fei Xing, Chunjiong Zhang, Ming Yan 2/20/2026

GraFSTNet: Graph-based Frequency SpatioTemporal Network for Cellular Traffic Prediction

Graph-based frequency spatio-temporal network for cellular traffic prediction capturing complex temporal dynamics and spatial correlations.

Ax Jiarong Liang, Max Ku, Ka-Hei Hui, Ping Nie, Wenhu Chen 2/20/2026

VisPhyWorld: Probing Physical Reasoning via Code-Driven Video Reconstruction

VisPhyWorld evaluation framework tests whether MLLMs reason about physical dynamics through code-driven video reconstruction tasks.

Ax Ricardo E. Gonzalez Penuela, Crescentia Jung, Sharon Y Lin, Ruiying Hu, Shiri Azenkot 2/20/2026

How Multimodal Large Language Models Support Access to Visual Information: A Diary Study With Blind and Low Vision People

Diary study examining how multimodal LLMs assist blind and low vision users accessing visual information through conversational interfaces.

Ax Qingqing Zhu, Qiao Jin, Tejas S. Mathai, Yin Fang, Zhizheng Wang, Yifan Yang, Maame Sarfo-Gyamfi, Benjamin Hou, Ran Gu, Praveen T. S. Balamuralikrishna, Kenneth C. Wang, Ronald M. Summers, Zhiyong Lu 2/20/2026

CT-Bench: A Benchmark for Multimodal Lesion Understanding in Computed Tomography

CT-Bench benchmark dataset with 20K+ lesion annotations from CT studies for multimodal lesion understanding and report generation.

Ax Muhammad J. Alahmadi, Peng Gao, Feiyi Wang, Dongkuan Xu 2/20/2026

Accelerating Large-Scale Dataset Distillation via Exploration-Exploitation Optimization

Optimization method for dataset distillation using exploration-exploitation to compress large datasets while retaining model performance.

Ax Beatrix M. G. Nielsen, Emanuele Marconato, Luigi Gresele, Andrea Dittadi, Simon Buchholz 2/20/2026

Logit Distance Bounds Representational Similarity

Proves logit distance bounds representational similarity for discriminative models including autoregressive language models.

Ax Amal Lahchim, Lambros Athanasiou 2/20/2026

Intracoronary Optical Coherence Tomography Image Processing and Vessel Classification Using Machine Learning

Automated ML pipeline for vessel segmentation and classification in intracoronary OCT images using preprocessing and artifact removal.

Ax Ha Na Cho, Sairam Sutari, Alexander Lopez, Hansen Bow, Kai Zheng 2/20/2026

Building Safe and Deployable Clinical Natural Language Processing under Temporal Leakage Constraints

Addresses temporal leakage in clinical NLP models for hospital discharge planning, ensuring safe deployment with realistic performance estimates.

Ax Pengfei Zhang, Tianxin Xie, Minghao Yang, Li Liu 2/20/2026

Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis

Resp-Agent system uses active adversarial curriculum learning for multimodal respiratory sound generation and disease diagnosis.

Ax Chenda Duan, Yipeng Zhang, Sotaro Kanai, Yuanyi Ding, Atsuro Daida, Pengyue Yu, Tiancheng Zheng, Naoto Kuroda, Shaun A. Hussain, Eishi Asano, Hiroki Nariai, Vwani Roychowdhury 2/20/2026

Omni-iEEG: A Large-Scale, Comprehensive iEEG Dataset and Benchmark for Epilepsy Research

Large-scale intracranial EEG dataset and benchmark for epilepsy research enabling automated seizure localization and clinical workflow support.

Ax Binchuan Qi 2/20/2026

Conjugate Learning Theory: Uncovering the Mechanisms of Trainability and Generalization in Deep Neural Networks

Theoretical framework using convex conjugate duality to characterize trainability and generalization properties of deep neural networks.

Ax Md. Najib Hasan, Touseef Hasan, Souvika Sarkar 2/20/2026

Are LLMs Ready to Replace Bangla Annotators?

Evaluates LLMs as zero-shot annotators for Bangla hate speech detection, examining reliability and bias in low-resource language settings.

Ax Yixue Zhang, Kun Wu, Zhi Gao, Zhen Zhao, Pei Ren, Zhiyuan Xu, Fei Liao, Xinhua Wang, Shichao Fan, Di Wu, Qiuxuan Feng, Meng Li, Zhengping Che, Chang Liu, Jian Tang 2/20/2026

RoboGene: Boosting VLA Pre-training via Diversity-Driven Agentic Framework for Real-World Task Generation

RoboGene uses an agentic framework to automatically generate diverse robotic manipulation tasks, addressing data scarcity in VLA pre-training.

Ax Wenxuan Ding, Nicholas Tomlin, Greg Durrett 2/20/2026

Calibrate-Then-Act: Cost-Aware Exploration in LLM Agents

Framework for LLM agents to reason about cost-uncertainty tradeoffs when deciding whether to explore environments before committing to answers.

Ax Nils Palumbo, Sarthak Choudhary, Jihye Choi, Prasad Chalasani, Somesh Jha 2/20/2026

Policy Compiler for Secure Agentic Systems

PCAS system enforces deterministic authorization policies in LLM agents for customer service, workflows, and compliance without relying on prompts.

Ax Saud Alghumayjan, Ming Yi, Bolun Xu 2/20/2026

A Few-Shot LLM Framework for Extreme Day Classification in Electricity Markets

Few-shot LLM classification framework predicts electricity market price spikes using natural language prompts with system state features.

Ax Lei Han, Mohamed Abdel-Aty, Zubayer Islam, Chenzhu Wang 2/20/2026

Real-time Secondary Crash Likelihood Prediction Excluding Post Primary Crash Features

ML framework for predicting secondary traffic crashes using real-time data features excluding post-crash information.

Ax Karan Bali, Jack Stanley, Praneet Suresh, Danilo Bzdok 2/20/2026

Quantifying LLM Attention-Head Stability: Implications for Circuit Universality

Studies stability of transformer attention-head circuits across model instances to determine if interpretability findings are universal or idiosyncratic.

Ax Haoxiang Sun, Lizhen Xu, Bing Zhao, Wotao Yin, Wei Wang, Boyu Yang, Rui Wang, Hu Wei 2/20/2026

DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning

DeepVision-103K dataset of 103K visually diverse mathematical problems for multimodal LLM reinforcement learning with verifiable rewards.

Ax Zhangyi Liu, Huaizhi Qu, Xiaowei Yin, He Sun, Yanjun Han, Tianlong Chen, Zhun Deng 2/20/2026

PETS: A Principled Framework Towards Optimal Trajectory Allocation for Efficient Test-Time Self-Consistency

PETS framework for principled trajectory allocation in test-time self-consistency scaling with sample efficiency optimization.

Ax Yongzhong Xu 2/20/2026

Low-Dimensional and Transversely Curved Optimization Dynamics in Grokking

Geometric analysis of grokking in transformers showing low-dimensional optimization dynamics with PCA of attention trajectories.

Ax Xidong Wang, Shuqi Guo, Yue Shen, Junying Chen, Jian Wang, Jinjie Gu, Ping Zhang, Lei Liu, Benyou Wang 2/20/2026

LiveClin: A Live Clinical Benchmark without Leakage

LiveClin live benchmark for clinical LLM evaluation using contemporary peer-reviewed cases updated biannually to prevent contamination.

Ax Ayush Roy, Tahsin Fuad Hassan, Roshan Ayyalasomayajula, Vishnu Suresh Lokhande 2/20/2026

Attending to Routers Aids Indoor Wireless Localization

Attention-based weighting mechanism that improves aggregation of Wi-Fi router signals for indoor localization.

Ax Alex Moody, Penina Axelrad, Rebecca Russell 2/20/2026

Machine Learning Argument of Latitude Error Model for LEO Satellite Orbit and Covariance Correction

ML method for correcting LEO satellite orbit propagation and uncertainty quantification under atmospheric drag mismodeling.

Ax Victoria Lin, Louis-Philippe Morency, Eli Ben-Michael 2/20/2026

Omitted Variable Bias in Language Models Under Distribution Shift

Analysis of how distribution shifts in language models relate to omitted variable bias and mitigation approaches.

Ax Victoria Lin, Xinnuo Xu, Rachel Lawrence, Risa Ueno, Amit Sharma, Javier Gonzalez, Niranjani Prasad 2/20/2026

Better Think Thrice: Learning to Reason Causally with Double Counterfactual Consistency

Method improving LLM causal reasoning on counterfactual questions via double counterfactual consistency learning.

Ax Xingyu Dang, Rohit Agarwal, Rodrigo Porto, Anirudh Goyal, Liam H Fowl, Sanjeev Arora 2/20/2026

Escaping the Cognitive Well: Efficient Competition Math with Off-the-Shelf Models

Efficient inference pipeline achieving IMO-level math reasoning with off-the-shelf models at reduced computational cost.

Ax Zifan Wang, Riccardo De Santi, Xiaoyu Mo, Michael M. Zavlanos, Andreas Krause, Karl H. Johansson 2/20/2026

Efficient Tail-Aware Generative Optimization via Flow Model Fine-Tuning

Fine-tuning diffusion and flow models for tail-aware generative optimization with control over reward distribution.

Ax Ammar Kheder, Helmi Toropainen, Wenqing Peng, Samuel Antão, Jia Chen, Zhi-Song Liu, Michael Boy 2/20/2026

TopoFlow: Physics-guided Neural Networks for high-resolution air quality prediction

Physics-guided neural network for air quality prediction using topography and wind direction.

Ax Elan Schonfeld, Elias Wisnia 2/20/2026

Learning under noisy supervision is governed by a feedback-truth gap

Two-timescale analysis showing feedback-truth gap governs learning under noisy supervision across neural networks and human studies.

Ax Zhicheng Zhang, Ziyan Wang, Yali Du, Fei Fang 2/20/2026

VAM: Verbalized Action Masking for Controllable Exploration in RL Post-Training -- A Chess Case Study

Verbalized Action Masking method for controllable exploration in RL post-training of LLMs, with a chess case study.

Ax Hanna Herasimchyk, Robin Labryga, Tomislav Prusina, Sören Laue 2/20/2026

A Residual-Aware Theory of Position Bias in Transformers

Residual-aware theoretical analysis explaining position bias in transformer attention mechanisms and cumulative attention rollout.

Ax Zeliang Zhang, Xiaodong Liu, Hao Cheng, Hao Sun, Chenliang Xu, Jianfeng Gao 2/20/2026

Training Large Reasoning Models Efficiently via Progressive Thought Encoding

Progressive Thought Encoding method for efficient training of large reasoning models via parameter-efficient RL fine-tuning.

Ax Rachitesh Kumar, Omar Mouchtaki 2/20/2026

What is the Value of Censored Data? An Exact Analysis for the Data-driven Newsvendor

Theoretical analysis of the data-driven newsvendor problem with censored demand data, including regret bounds.

Ax Jianliang He, Leda Wang, Siyu Chen, Zhuoran Yang 2/20/2026

On the Mechanism and Dynamics of Modular Addition: Fourier Features, Lottery Ticket, and Grokking

Mechanistic analysis of how two-layer networks learn Fourier features for modular addition with theoretical training dynamics explanation.

Ax Yaroslav Solovko 2/20/2026

ML-driven detection and reduction of ballast information in multi-modal datasets

Framework for detecting and reducing ballast information across structured, semi-structured, and unstructured multimodal datasets.

Ax F. S. Menezes, M. C. F. G. Barretto, E. Q. C. Garcia, T. A. E. Ferreira, J. G. Alvez 2/20/2026

Construction of a classification model for dementia among Brazilian adults aged 50 and over

Dementia classification model for Brazilian adults using variable selection and multivariable analysis from ELSI-Brazil dataset.

Ax Philip Sosnin, Jodie Knapp, Fraser Kennedy, Josh Collyer, Calvin Tsay 2/20/2026

Exact Certification of Data-Poisoning Attacks Using Mixed-Integer Programming

Mixed-integer programming framework providing sound and complete certification guarantees for worst-case data poisoning attacks.

Ax Chuqin Geng, Li Zhang, Haolin Ye, Ziyu Zhao, Yuhe Jiang, Tara Saba, Xinyu Wang, Xujie Si 2/20/2026

Beyond Message Passing: A Symbolic Alternative for Expressive and Interpretable Graph Learning

Symbolic alternative to GNNs with improved expressivity beyond 1-Weisfeiler-Lehman barrier and fine-grained interpretability.

Ax Chuqin Geng, Li Zhang, Mark Zhang, Haolin Ye, Ziyu Zhao, Xujie Si 2/20/2026

Neural Proposals, Symbolic Guarantees: Neuro-Symbolic Graph Generation with Hard Constraints

NSGGM: neuro-symbolic framework for molecule generation combining neural proposals with symbolic constraints for controllability.

Ax Sourav Chakraborty, Amit Kiran Rege, Claire Monteleoni, Lijun Chen 2/20/2026

Multi-Agent Lipschitz Bandits

Communication-free decentralized multi-agent bandit protocol with Lipschitz-structured action spaces and hard collision constraints.

Ax Sourav Chakraborty, Amit Kiran Rege, Claire Monteleoni, Lijun Chen 2/20/2026

A Unified Framework for Locality in Scalable MARL

Unified framework for exploiting locality in scalable multi-agent RL with relaxed conditions on exponential decay property.

Ax Yongzhong Xu 2/20/2026

Early-Warning Signals of Grokking via Loss-Landscape Geometry

Studies grokking transition via loss-landscape geometry on sequence-learning tasks SCAN and Dyck-1 using commutator defect metrics.

Ax Zachary Coalson, Beth Sohler, Aiden Gabriel, Sanghyun Hong 2/20/2026

Fail-Closed Alignment for Large Language Models

Fail-closed alignment design principle for robust LLM safety through redundant refusal mechanisms across latent features.

Ax Leo Marchyok, Zachary Coalson, Sungho Keum, Sooel Son, Sanghyun Hong 2/20/2026

Discovering Universal Activation Directions for PII Leakage in Language Models

UniLeak: mechanistic interpretability framework identifying universal activation directions that trigger PII leakage in language models.

Ax Rahul Thomas, Teo Kitanovski, Micah Goldblum, Arka Pal 2/20/2026

Dynamic Delayed Tree Expansion For Improved Multi-Path Speculative Decoding

Dynamic Delayed Tree Expansion improves multi-path speculative decoding for faster LLM token sampling verification.