Revision or Re-Solving? Decomposing Second-Pass Gains in Multi-LLM Pipelines
Controlled decomposition of multi-LLM revision pipelines to separate gains into re-solving, scaffold, and content components across benchmarks.
Study on popularity bias in recommender systems and alignment with user preferences for popular vs niche content.
Automated framework to evaluate and harden LLM system instructions against encoding-based attacks to prevent credential and policy leakage.
Study of adversarial attacks targeting AI-driven radio access network slicing systems and recovery mechanisms.
Security framework to detect and prevent vulnerabilities in AI-generated code through systematic verification of code safety gates.
Training-free detection method for partial audio deepfakes using speech foundation models without frame-level annotations.
Analysis of how LLMs use induction heads to track and retrieve information from context, revealing serial-recall patterns in in-context learning.
Algorithm for approximating Pareto frontiers in stochastic multi-objective optimization problems under uncertainty.
Study on how students' trust in AI assistants affects their reliance and critical evaluation of AI-generated output in educational settings.
Parameter-efficient adapter framework for adapting CLIP vision-language models to monocular depth estimation with minimal supervision.
PaperRecon evaluation framework for assessing quality and hallucination risks in AI-generated research papers from coding agents.
Generative approach for hyperspectral unmixing in remote sensing. Domain-specific to satellite imagery, not AI/LLM focused.
Brainstacks modular architecture for continual multi-domain LLM fine-tuning using MoE-LoRA stacks composing frozen adapters for domain expertise.
AdaLoRA-QAT framework for chest X-ray segmentation using low-rank adaptation and quantization-aware training. Medical imaging domain, not AI/LLM focused.
ORCA framework for test-time calibration of LLM reasoning using conformal prediction, improving efficiency of sampling-based scaling methods.
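ORCA's exact procedure isn't detailed here, but split conformal prediction itself is a standard recipe. A minimal, generic sketch of calibrating a confidence threshold on held-out scores (function name, scores, and the nonconformity choice are illustrative, not ORCA's actual design):

```python
import numpy as np

def conformal_threshold(cal_scores, alpha=0.1):
    """Split conformal prediction: given nonconformity scores on a
    held-out calibration set, return the threshold that gives
    ~(1 - alpha) coverage on exchangeable test points."""
    n = len(cal_scores)
    # Finite-sample corrected quantile level.
    q = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(cal_scores, min(q, 1.0), method="higher")

# Illustrative: nonconformity = 1 - model confidence on calibration answers.
cal_scores = np.array([0.05, 0.10, 0.20, 0.30, 0.15, 0.25, 0.08, 0.12])
tau = conformal_threshold(cal_scores, alpha=0.25)
# At test time, accept sampled answers whose score <= tau and stop sampling
# early once one passes -- the usual source of efficiency gains in
# sampling-based test-time scaling.
```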
Multiscreen architecture introducing explicit query-key relevance rejection mechanism in attention, improving LLM discrimination of irrelevant information.
ROS 2 middleware integration for Florence-2 vision-language model in robotics systems, enabling local inference for robotic perception.
ORBIT dataset with 20K reasoning-intensive queries for training search agents combining LMs and web search, using verifiable generation methodology.
Agentic evolutionary framework for scientific algorithm discovery combining LLM-guided search with structured theory and code co-evolution.
Benchmark for evaluating LLM agents on long-term planning over a one-year startup simulation spanning hundreds of turns, testing strategic coherence under uncertainty.

Mathematical framework analyzing AI weather prediction pipelines, emphasizing training methodology and data diversity over architecture choices.
Spatio-temporal dynamics reconstruction from sparse observations using shallow recurrent decoders. Domain-specific to complex systems, not AI/ML focused.
Method for unsupervised code correctness evaluation using LLMs through code comprehension before auditing, eliminating need for reference implementations.
Survey of agentic RAG systems combining LLMs with real-time retrieval to address static training data limitations and improve contextual accuracy.
Research on fine-tuning LLMs as agentic systems to handle exceptions and improve decision-making in complex real-world contexts.
Study on mitigating reasoning biases in LLMs through activation steering at inference time to improve logical validity discrimination.
Research evaluating LLM reasoning capabilities on real-world site selection tasks, testing if models like o1 and DeepSeek-R1 generalize beyond math/code domains.
Benchmark and framework for training hierarchical multi-agent LLM systems with master-coordinator and specialized sub-agents for e-commerce applications.
Approach using LLMs to automate formulation of dynamic programming models for operations research, addressing stochastic transitions and data scarcity.
Retrieval-of-Thought method that reuses reasoning steps across problems via thought graphs to improve inference efficiency and reduce latency/cost.
Research on self-replication risks in LLM agents driven by objective misalignment, moving from theoretical concern to practical reality assessment.
Genesis: framework evolving attack strategies for red-teaming LLM web agents using behavioral pattern learning.
EHRStruct: benchmark framework evaluating LLM performance on structured electronic health record tasks with standardized metrics.
Alphacast: agentic reasoning framework for time series forecasting using iterative multi-step reasoning with domain knowledge integration.
DR-LoRA: parameter-efficient fine-tuning method for MoE LLMs using dynamic rank allocation based on expert specialization.
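DR-LoRA's dynamic rank-allocation mechanism isn't specified here, but the LoRA building block it extends is standard: a frozen weight plus a trainable low-rank update, where the rank `r` is the budget a dynamic scheme would vary per expert. A minimal NumPy sketch (dimensions and scaling are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 64, 64, 8  # layer dims and LoRA rank (illustrative sizes)

W = rng.normal(size=(d, k))          # frozen pretrained weight
A = rng.normal(size=(d, r)) * 0.01   # trainable down-projection
B = np.zeros((r, k))                 # trainable up-projection, zero-init

def lora_forward(x, scale=1.0):
    # Frozen path plus low-rank update; only A and B receive gradients.
    # With B zero-initialized, the layer starts identical to the base model.
    return x @ W + scale * (x @ A @ B)
```

A dynamic-rank scheme would, roughly, grow or shrink `r` per expert according to how specialized that expert is, rather than fixing one rank for the whole MoE model.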
ReasonMa: semantic-guided watermarking technique for reasoning LLMs that preserves logical coherence while protecting model intellectual property.
Lexpop framework using deep RL to train finite-state controllers for solving POMDPs robustly.
Survey on meta-learning and meta-reinforcement learning enabling rapid adaptation to novel tasks with minimal data.
Method improving heterogeneous agent collective accuracy using calibration and selective abstention in voting systems.
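The paper's own aggregation rule isn't given here, but confidence-weighted voting with abstention is a common baseline for this setting. A generic sketch (function name and threshold are illustrative assumptions, not the paper's method):

```python
from collections import defaultdict

def weighted_vote(answers, confidences, abstain_below=0.5):
    """Confidence-weighted majority vote over heterogeneous agents.
    Agents whose calibrated confidence falls below `abstain_below`
    abstain rather than dilute the vote."""
    tally = defaultdict(float)
    for ans, conf in zip(answers, confidences):
        if conf >= abstain_below:
            tally[ans] += conf
    if not tally:
        return None  # every agent abstained
    return max(tally, key=tally.get)

# Three agents disagree; the low-confidence one abstains.
print(weighted_vote(["A", "B", "A"], [0.9, 0.8, 0.4]))  # -> A
```

Calibration matters here because the weights are only meaningful if each agent's reported confidence tracks its actual accuracy.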
Analysis of LLM-based agents' capability to generate propaganda and rhetorical manipulation, with detection of techniques like loaded language and appeals to fear.
AI-assisted formalization of Vlasov-Maxwell-Landau system equilibrium in Lean 4 using DeepThink reasoning and Claude Code agent for automated theorem proving.
Attribution method for multi-agent systems that identifies responsible agents without execution logs by analyzing final text only, addressing privacy-constrained scenarios.
Training-free uncertainty quantification framework for combining multiple vision-language models through semantic-consistent opinion pooling to reduce hallucinations.
Foundation multimodal model for electromagnetic domain covering perception, recognition, and decision-making using LLM capabilities adapted for domain-specific applications.
Compiler for analyzing and visualizing structured agent traces including nested tool calls, reasoning blocks, and sub-agent invocations for better agentic system understanding.
Decision-theoretic framework (Triadic Cognitive Architecture) for tool-using agents that bounds information-acquisition costs and tool usage to prevent systematic failures.
Self-supervised learning method for RL agents that models agent and environment separately to improve sample efficiency without requiring supervisory signals.
Demonstrates hard-label extraction of deep neural networks via side-channel attacks using divide-and-conquer strategy for DNN intellectual property theft.
Addresses accuracy loss in distracted driver classification across camera conditions using feature disentanglement and contrastive learning for robustness.
Project management framework using generative AI agents to address team composition gaps by matching sociologically identified personality patterns and roles.