Isolater - Feed

Ax Yuan Cao, Dezhi Ran, Yuzhe Guo, Mengzhou Wu, Simin Chen, Linyi Li, Wei Yang, Tao Xie 3/11/2026

An Empirical Study and Theoretical Explanation on Task-Level Model-Merging Collapse

Empirical study of catastrophic performance degradation in merged task-specialist LLMs, analyzing representation and task-specific feature conflicts.

Ax Zhuoran Deng, Yizhi Zhang, Ziyi Zhang, Wan Shen 3/11/2026

Telogenesis: Goal Is All U Need

Goal-conditioned system that allocates attention using endogenous priority functions based on epistemic gaps (ignorance, surprise, and staleness).

Ax Andrew Murray, Danial Dervovic, Alberto Pozanco, Michael Cashmore 3/11/2026

GenePlan: Evolving Better Generalized PDDL Plans using Large Language Models

GenePlan framework using LLM-assisted evolutionary algorithms to generate domain-dependent PDDL planners that minimize plan length across problem instances.

Ax Pietro Dell'Oglio, Alessandro Bondielli, Francesco Marcelloni, Lucia C. Passaro 3/11/2026

Enhancing Debunking Effectiveness through LLM-based Personality Adaptation

LLM-based approach for generating personalized fake news debunking messages using Big Five personality trait alignment.

Ax Vera V. Vishnyakova 3/11/2026

Context Engineering: From Prompts to Corporate Multi-Agent Architecture

Context engineering discipline for designing agent decision environments in multi-step autonomous systems, extending beyond prompt engineering to full informational management.

Ax Arash Shahmansoori 3/11/2026

PRECEPT: Planning Resilience via Experience, Context Engineering & Probing Trajectories - A Unified Framework for Test-Time Adaptation with Compositional Rule Learning and Pareto-Guided Prompt Evolution

PRECEPT framework for test-time adaptation in LLM agents with structured rule retrieval, conflict-aware memory, and adversarial knowledge detection capabilities.

Ax Zuhao Zhang, Chengyue Yu, Yuante Li, Chenyi Zhuang, Linjian Mo, Shuai Li 3/11/2026

MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants

Benchmark evaluating LLMs' ability to generate interactive HTML-based MiniApps with visual interfaces and customized interaction logic beyond static text.

Ax Xin An, Jingyi Cai, Xiangyang Chen, Huayao Liu, Peiting Liu, Peng Wang, Bei Yang, Xiuwen Zhu, Yongfan Chen, Baoyu Hou, Shuzhao Li, Weidong Ren, Fan Yang, Jiangtao Zhang, Xiaoxiao Xu, Lin Qu 3/11/2026

Logics-Parsing-Omni Technical Report

Unified multimodal parsing framework with a hierarchical taxonomy for documents, images, and audio-visual streams, using a progressive parsing paradigm.

Ax Aman Sharma, Paras Chopra 3/11/2026

EsoLang-Bench: Evaluating Genuine Reasoning in Large Language Models via Esoteric Programming Languages

Benchmark using esoteric programming languages to evaluate genuine reasoning vs. memorization in LLMs; the languages are chosen so that gaming the benchmark is economically irrational.

Ax Ming Wen, Kun Yang, Jingyu Zhang, Yuxuan Liu, Shiwen Cui, Shouling Ji, Xingjun Ma 3/11/2026

OOD-MMSafe: Advancing MLLM Safety from Harmful Intent to Hidden Consequences

OOD-MMSafe: safety benchmark for multimodal LLMs focusing on consequence-driven safety for autonomous and embodied agents, with 455 curated query-image pairs.

Ax Peng Sun, Huawen Shen, Yi Ban, Tianfan Fu, Yanbo Wang, Yuqiang Li 3/11/2026

Does the Question Really Matter? Training-Free Data Selection for Vision-Language SFT

Training-free data selection method for vision-language models that identifies samples requiring genuine cross-modal reasoning rather than linguistic shortcuts.

Ax Xiaoxing Wang, Ning Liao, Shikun Wei, Chen Tang, Feiyu Xiong 3/11/2026

AutoAgent: Evolving Cognition and Elastic Memory Orchestration for Adaptive Agents

Self-evolving multi-agent framework with dynamic cognition and elastic memory orchestration for adaptive agents in non-stationary environments.

Ax Jonah Brown-Cohen, David Lindner, Rohin Shah 3/11/2026

Quantifying the Necessity of Chain of Thought through Opaque Serial Depth

Theoretical analysis of chain-of-thought necessity in LLMs through opaque serial depth, formalizing computation constraints in Transformers.

Ax Tung Tran, Danilo Vasconcellos Vargas, Khoat Than 3/11/2026

LCA: Local Classifier Alignment for Continual Learning

Continual learning approach using local classifier alignment on pre-trained models to mitigate catastrophic forgetting in changing environments.

Ax Hongbo Bo, Jingyu Hu, Weiru Liu 3/11/2026

Influencing LLM Multi-Agent Dialogue via Policy-Parameterized Prompts

Policy-parameterized prompt framework for controlling LLM multi-agent dialogue behavior using lightweight state-action policies instead of ad hoc prompts.

Ax Yunhang Qian, Xiaobin Hu, Jiaquan Yu, Siyang Xin, Xiaokun Chen, Jiangning Zhang, Peng-Tao Jiang, Jiawei Liu, Hongwei Bran Li 3/11/2026

MedMASLab: A Unified Orchestration Framework for Benchmarking Multimodal Medical Multi-Agent Systems

Unified benchmarking framework for multimodal medical multi-agent systems addressing architectural fragmentation and standardized evaluation.

Ax Jinyue Li, Yuci Liang, Qiankun Li, Xinheng Lyu, Jiayu Qian, Huabao Chen, Kun Wang, Zhigang Zeng, Anil Anthony Bharath, Yang Liu 3/11/2026

PathMem: Toward Cognition-Aligned Memory Transformation for Pathology MLLMs

Framework for integrating domain knowledge and diagnostic reasoning into pathology multimodal LLMs with cognition-aligned memory mechanisms.

Ax Ronald Doku 3/11/2026

The Confidence Gate Theorem: When Should Ranked Decision Systems Abstain?

Formal analysis of when confidence-based abstention improves ranked decision systems through rank-alignment and inversion zone conditions.

Ax Ann Yuan, Asma Ghandeharioun, Carter Blum, Alicia Machado, Jessica Hoffmann, Daphne Ippolito, Martin Wattenberg, Lucas Dixon, Katja Filippova 3/11/2026

Think Before You Lie: How Reasoning Improves Honesty

Study investigating how reasoning affects deceptive behavior in LLMs using moral trade-off datasets, finding that, unlike in humans, reasoning increases honesty.

Ax Chengyu Shen, Zhen Hao Wong, Runming He, Hao Liang, Meiyi Qiang, Zimo Meng, Zhengyang Zhao, Bohan Zeng, Zhengzhou Zhu, Bin Cui, Wentao Zhang 3/11/2026

Let's Verify Math Questions Step by Step

Research on verifying the math questions used in LLM training, focusing on whether the questions themselves are valid rather than only on whether the reasoning paths are correct.

Ax Jatin Chhugani, Geonhwa Jeong, Bor-Yiing Su, Yunjie Pan, Hanmei Yang, Aayush Ankit, Jiecao Yu, Summer Deng, Yunqing Chen, Nadathur Satish, Changkyu Kim 3/11/2026

Unveiling the Potential of Quantization with MXFP4: Strategies for Quantization Error Reduction

Overflow-Aware Scaling and Macro Block Scaling techniques for MXFP4 quantization to reduce accuracy loss in LLM inference.

Ax The Verkor Team, Ravi Krishna, Suresh Krishna, David Chin 3/11/2026

Design Conductor: An agent autonomously builds a 1.5 GHz Linux-capable RISC-V CPU

Design Conductor: autonomous agent using frontier LLMs to build a complete Linux-capable RISC-V CPU (VerCore) end-to-end in 12 hours.

Ax Zhengyuan Shi, Jingxin Wang, Tairan Cheng, Changran Xu, Weikang Qian, Qiang Xu 3/11/2026

CktEvo: Repository-Level RTL Code Benchmark for Design Evolution

CktEvo: repository-level RTL code benchmark for evaluating LLM performance on iterative hardware design evolution tasks.

Ax Mu-Chi Chen, Yu-Hung Kao, Po-Hsuan Huang, Shao-Chun Ho, Hsiang-Yu Tsou, I-Ting Wu, En-Ming Huang, Yu-Kai Hung, Wei-Po Hsin, Cheng Liang, Chia-Heng Tu, Shih-Hao Hung, Hsiang-Tsung Kung 3/11/2026

SiliconMind-V1: Multi-Agent Distillation and Debug-Reasoning Workflows for Verilog Code Generation

SiliconMind-V1: multi-agent LLM framework with debug-reasoning workflows for Verilog code generation without external verification tools.

Ax T. Baldi, D. Casini, A. Biondi 3/11/2026

ALADIN: Accuracy-Latency-Aware Design-space Inference Analysis for Embedded AI Accelerators

ALADIN: design-space inference analysis framework for mixed-precision quantized neural networks on embedded AI accelerators.

Ax Hiroki Fukui 3/11/2026

Alignment Is the Disease: Censorship Visibility and Alignment Constraint Complexity as Determinants of Collective Pathology in Multi-Agent LLM Systems

Experimental study of collective pathology in multi-agent LLM systems, investigating alignment constraints as a source of iatrogenic harm.

Ax Jianlong Lei, Shashikant Ilager 3/11/2026

ARKV: Adaptive and Resource-Efficient KV Cache Management under Limited Memory Budget for Long-Context Inference in LLMs

ARKV: adaptive KV cache management framework for ultra-long context LLM inference with dynamic memory budget constraints.

Ax Sangkeum Lee 3/11/2026

Measurement-Free Ancilla Recycling via Blind Reset: A Cross-Platform Study on Superconducting and Trapped-Ion Processors

Cross-platform study of measurement-free ancilla recycling via blind reset on superconducting and trapped-ion quantum processors.

Ax Sales Aribe Jr., Gil Nicholas Cagande 3/11/2026

Benchmarking Federated Learning in Edge Computing Environments: A Systematic Review and Performance Evaluation

Systematic review and performance evaluation of federated learning techniques for edge computing environments with privacy and efficiency focus.

Ax Mohammed Cherifi 3/11/2026

Autonomous Edge-Deployed AI Agents for Electric Vehicle Charging Infrastructure Management

Auralink SDC: edge-deployed autonomous AI agents for managing EV charging infrastructure with improved fault detection and latency.

Ax Atousa Jafari, Mahdi Taheri, Hassan Ghasemzadeh Mohammadi, Christian Herglotz, Marco Platzner 3/11/2026

Sensitivity-Guided Framework for Pruned and Quantized Reservoir Computing Accelerators

Sensitivity-based pruning and quantization framework for compressing reservoir computing models with hardware efficiency trade-offs.

Ax Soumita Chatterjee, Sudip Ghosh, Tamal Ghosh, Hafizur Rahaman 3/11/2026

Architectural Design and Performance Analysis of FPGA based AI Accelerators: A Comprehensive Review

Review of FPGA-based AI accelerator architectural design and performance for deep learning tasks including NLP and autonomous decision-making.

Ax Mengqi Liao, Lu Wang, Chaoyun Zhang, Bo Qiao, Si Qin, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Huaiyu Wan 3/11/2026

Zipage: Maintain High Request Concurrency for LLM Reasoning through Compressed PagedAttention

Compressed PagedAttention method combining token-wise KV cache eviction with PagedAttention to sustain high-concurrency LLM reasoning with reduced memory bottlenecks.

Ax Musa Cim, Burak Topcu, Mahmut Taylan Kandemir 3/11/2026

Diagnosing FP4 inference: a layer-wise and block-wise sensitivity analysis of NVFP4 and MXFP4

Layer-wise sensitivity analysis of NVFP4 and MXFP4 quantization formats for LLM inference on advanced hardware architectures.

Ax Seungwoo Jeong, Heung-Il Suk 3/11/2026

Permutation-Equivariant 2D State Space Models: Theory and Canonical Architecture for Multivariate Time Series

State space models with permutation equivariance for multivariate time series modeling without artificial variable ordering.

Ax Hui-Ze Tan, Xiao-Wen Yang, Hao Chen, Jie-Jing Shao, Yi Wen, Yuteng Shen, Weihong Luo, Xiku Du, Lan-Zhe Guo, Yu-Feng Li 3/11/2026

Hindsight Credit Assignment for Long-Horizon LLM Agents

HCAPO framework integrating hindsight credit assignment to improve long-horizon LLM agent performance on multi-step tasks with sparse rewards.

Ax Muyukani Kizito 3/11/2026

Turn: A Language for Agentic Computation

Turn: compiled actor-based programming language with static schema typing for building autonomous agentic software that delegates inference to LLMs.

Ax Sahal Sajeer, Krish Patel, Oscar Chung, Joel Song Bae 3/11/2026

EDMFormer: Genre-Specific Self-Supervised Learning for Music Structure Segmentation

Transformer model for electronic dance music structure segmentation using energy, rhythm, and timbre analysis instead of lyrical/harmonic similarity.

Ax Shaun Feakins, Ibrahim Habli, Phillip Morgan 3/11/2026

Clear, Compelling Arguments: Rethinking the Foundations of Frontier AI Safety Cases

Framework for creating structured safety arguments for frontier AI systems, adapting aerospace/automotive safety case methodologies.

Ax Sichen Yang (Johns Hopkins University), Mauro Maggioni (Johns Hopkins University) 3/11/2026

Multi-level meta-reinforcement learning with skill-based curriculum

Multi-level meta-reinforcement learning approach using skill-based curriculum for hierarchical sequential decision making and MDP compression.

Ax Shiheng Li, Jacob M. Miller, Phoebe J. Lee, Gustav Andersson, Christopher R. Conner, Yash J. Joshi, Bayan Karimi, Amber M. King, Howard L. Malc, Harsh Mishra, Hong Qiao, Minseok Ryu, Xuntao Wu, Siyuan Xing, Haoxiong Yan, Jian Shi, Andrew N. Cleland 3/11/2026

Large Language Model-Assisted Superconducting Qubit Experiments

Framework automating superconducting qubit experiment design and control sequences using LLMs.

Ax Tzafrir Rehan 3/11/2026

Test-Driven AI Agent Definition (TDAD): Compiling Tool-Using Agents from Behavioral Specifications

TDAD methodology for developing tool-using agents via behavioral specifications and automated testing, addressing production compliance.

Ax Piyush Gupta, Sangjae Bae, Jiachen Li, David Isele 3/11/2026

Scale-Plan: Scalable Language-Enabled Task Planning for Heterogeneous Multi-Robot Teams

LLM-based framework for scalable task planning in heterogeneous multi-robot systems using natural language.

Ax Saron Samuel, Alexander Martin, Eugene Yang, Andrew Yates, Dawn Lawrie, Ian Soboroff, Laura Dietz, Benjamin Van Durme 3/11/2026

Beyond Relevance: On the Relationship Between Retrieval and RAG Information Coverage

Study examining relationship between retrieval quality and information coverage in RAG systems for report generation.

Ax Shijia Liao, Yuxuan Wang, Songting Liu, Yifan Cheng, Ruoyi Zhang, Tianyu Li, Shidong Li, Yisheng Zheng, Xingwei Liu, Qingzheng Wang, Zhizhuo Zhou, Jiahua Liu, Xin Chen, Dawei Han 3/11/2026

Fish Audio S2 Technical Report

Fish Audio S2: open-source text-to-speech system with instruction-following control via natural language descriptions.

Ax Jay Revolinsky, Harry Shomer, Jiliang Tang 3/11/2026

Are Expressive Encoders Necessary for Discrete Graph Generation?

GenGNN framework for graph generation achieving comparable performance to transformers with 2-5x faster inference.

Ax Mohammad Hossein Safarpour, Seyed Mohammad Alavi, Mohammad Izadikhah, Hossein Dibachi 3/11/2026

A New Modeling to Feature Selection Based on the Fuzzy Rough Set Theory in Normal and Optimistic States on Hybrid Information Systems

Feature selection method for hybrid information systems using fuzzy rough set theory for big data applications.

Ax Pratyay Kumar, Abu Saleh Md Tayeen, Satyajayant Misra, Huiping Cao, Jiefei Liu, Qixu Gong, Jayashree Harikumar 3/11/2026

NetDiffuser: Deceiving DNN-Based Network Attack Detection Systems with Diffusion-Generated Adversarial Traffic

Adversarial attack method using diffusion models to deceive deep learning-based network intrusion detection systems.

Ax Abhinaba Basu 3/11/2026

Cross-Domain Uncertainty Quantification for Selective Prediction: A Comprehensive Bound Ablation with Transfer-Informed Betting

Theoretical framework for selective prediction with risk control combining multiple concentration inequalities and betting-based confidence sequences.

Ax Daniel M. Jimenez-Gutierrez, Giovanni Giunta, Mehrdad Hassanzadeh, Aris Anagnostopoulos, Ioannis Chatzigiannakis, Andrea Vitaletti 3/11/2026

FedLECC: Cluster- and Loss-Guided Client Selection for Federated Learning under Non-IID Data

Federated learning technique optimizing client selection under non-IID data distribution for collaborative model training.