Do Large Language Models Possess a Theory of Mind? A Comparative Evaluation Using the Strange Stories Paradigm
Comparative study evaluating whether LLMs demonstrate Theory of Mind capabilities using the Strange Stories psychological paradigm.
TherapyGym evaluation framework for therapy chatbots measuring clinical fidelity and safety using psychotherapy rating scales.
Uncertainty-calibrated prompt optimization framework for LLM classification that measures model confidence to improve reliability.
LLM-based agent framework for automated extraction of structured political biography data from unstructured sources at scale.
DynaRAG framework extending RAG with dynamic API calls for time-sensitive queries; includes sufficiency classification and reranking.
Analysis of explainability in harmful content detection models, examining predictions on borderline and contextual cases.
MineDraft framework for batch parallel speculative decoding to accelerate LLM inference by parallelizing draft and verification stages.
Tool for collecting granular metadata about language model benchmarks to verify alignment with practitioner goals and test coverage.
Multi-task learning framework for personalized open-vocabulary keyword spotting with privacy and customization for voice assistants.
Keyword spotting framework integrating phoneme learning with personalized prosody modeling for speaker-specific voice recognition.
Study examining relationship between firms' AI technology innovation investments and consumer complaint patterns.
Adaptive Extended Kalman Filter using knowledge distillation for improved UWB/PDR indoor localization under NLOS conditions.
Method for increasing transformer modularity and interpretability through per-layer supervision to overcome distributed redundancy.
Quine runtime that implements LLM agents as native POSIX processes using OS-level isolation and scheduling instead of application-layer frameworks.
Method for distinguishing between system failures and domain shifts in industrial data streams using anomaly detection.
Study of poisoning attacks against RAG systems where adversaries corrupt retrieval corpora to manipulate LLM outputs; includes defenses.
Research on multi-agent LLM routing systems showing that quality-based delegation can fail when agents misreport performance; proposes delegation contracts to address this.
NANOZK: Zero-knowledge proof system enabling cryptographic verification that proprietary LLM API outputs actually used claimed models.
S3T-Former: Energy-efficient spike-driven state-space transformer for skeleton-based action recognition on resource-constrained edge devices.
MCP-38: Protocol-specific threat taxonomy with 38 threat categories for Model Context Protocol systems derived through systematic methodology.
Synthesizable RTL implementation of predictive coding networks enabling online, distributed hardware learning as alternative to backpropagation.
Lightweight LLM adaptation framework for technical service agents using latent logic augmentation and noise reduction techniques.
SLEA-RL: Step-level experience augmentation for multi-turn LLM agent training enabling dynamic retrieval and leveraging accumulated episode experiences.
Meta-BayFL: Probabilistic federated learning framework with Bayesian neural networks for heterogeneous data and model personalization.
Study uncovering latent phase structures and branching logic in deep RL locomotion policies for HalfCheetah control task interpretability.
Dynamic constraints framework for reinforcement learning fine-tuning that adapts constraints based on model capabilities to balance stability and optimization.
CytoSyn: Foundation diffusion model for computational histopathology enabling cell segmentation and tumor analysis from digitized slides.
Trace-based assurance framework for agentic AI orchestration with contracts, testing, and governance for LLM-coordinated multi-agent systems.
Training-only framework for few-shot CLIP adapters using heterogeneous image-patch-text graph supervision without inference cost overhead.
ARTEMIS: Neuro-symbolic framework combining neural operators and SDEs for interpretable, arbitrage-free quantitative finance models.
Discovery of bimodal drift rate structure in fast radio burst FRB 20240114A using unsupervised machine learning for astrophysics analysis.
Tula: Optimization framework for distributed large-batch training balancing communication overhead, computation cost, and generalization performance.
VC-Soup: Method for aligning LLMs with multiple conflicting human values using value-consistency guidance for trustworthy AI development.
Grace Cycle: LLM-augmented computational phenotyping framework for discovering clinical subtypes in Long COVID through iterative hypothesis generation and evidence extraction.
Conceptual framework proposing intellectual stewardship for how humans should adapt their roles in creative knowledge work alongside AI systems.
Insight-V++: Multi-agent visual reasoning framework for MLLMs enabling long-chain reasoning with high-quality training data and optimized pipelines.
Workshop report on advancing robotics and AI in healthcare, highlighting coordination needs between engineering and clinical priorities for safety and reliability.
User study demonstrating that extensive LLM use for writing assistance alters the voice, tone, and meaning of human text, including a 70% increase in essay length.
Post-training framework adapting vision-language models for safety-critical autonomous driving event detection in dashcam footage through temporal alignment.
RAG-based system using LLMs for automated cybersecurity incident analysis through targeted log filtering across multiple data sources.
Gradient-informed temporal sampling strategy for training neural PDE surrogates, improving rollout accuracy beyond uniform and augmentation-based sampling.
MolRGen benchmark and training framework for evaluating reasoning-based LLMs on de novo molecular generation for drug discovery without ground-truth molecule pairs.
Interventional Boundary Discovery method using causal inference to identify controllable state dimensions in reinforcement learning with confounded distractors.
Sharpness-aware minimization technique in logit space addressing squeezing effect in Direct Preference Optimization for LLM alignment.
Low-rank convolution optimization for neural video compression (NeRV) reducing computational cost and memory for resource-constrained environments.
Analysis of LLM alignment through concept routing rather than detection, studying political censorship across nine open-weight Chinese language models.
Measurement study comparing computational costs of mobile robotic manipulation workloads across onboard, edge, and cloud GPU platforms using foundation models.
Sparse supervised learning framework for monocular 3D object tracking in videos, reducing annotation requirements for autonomous agent perception.
ChoiceEval framework for auditing brand and cultural preference biases in LLMs used as market intermediaries affecting consumer choices.
Neural graph representation method using reinforcement learning to solve approximate subgraph matching, an NP-hard problem in graph analysis.