Isolater - Feed

Ax J. E. Dom\'inguez-Vidal 4/2/2026

A ROS 2 Wrapper for Florence-2: Multi-Mode Local Vision-Language Inference for Robotic Systems

ROS 2 middleware integration for Florence-2 vision-language model in robotics systems, enabling local inference for robotic perception.

Ax Nandan Thakur, Zijian Chen, Xueguang Ma, Jimmy Lin 4/2/2026

ORBIT: Scalable and Verifiable Data Generation for Search Agents on a Tight Budget

ORBIT dataset with 20K reasoning-intensive queries for training search agents combining LMs and web search, using verifiable generation methodology.

Ax Youssef Mroueh, Carlos Fonseca, Brian Belgodere, David Cox 4/2/2026

CliffSearch: Structured Agentic Co-Evolution over Theory and Code for Scientific Algorithm Discovery

Agentic evolutionary framework for scientific algorithm discovery combining LLM-guided search with structured theory and code co-evolution.

Ax Muyu He, Adit Jain, Anand Kumar, Vincent Tu, Soumyadeep Bakshi, Sachin Patro, Nazneen Rajani 4/2/2026

$\texttt{YC-Bench}$: Benchmarking AI Agents for Long-Term Planning and Consistent Execution

Benchmark for evaluating LLM agents on long-term planning over one-year startup simulation with hundreds of turns, testing strategic coherence under uncertainty.

Ax Piyush Garg, Diana R. Gergel, Andrew E. Shao, Galen J. Yacalis 4/2/2026

The Recipe Matters More Than the Kitchen:Mathematical Foundations of the AI Weather Prediction Pipeline

Mathematical framework analyzing AI weather prediction pipelines, emphasizing training methodology and data diversity over architecture choices.

Ax Yuxuan Bao, Xingyue Zhang, J. Nathan Kutz 4/2/2026

LAtent Phase Inference from Short time sequences using SHallow REcurrent Decoders (LAPIS-SHRED)

Spatio-temporal dynamics reconstruction from sparse observations using shallow recurrent decoders. Domain-specific to complex systems, not AI/ML focused.

Ax Bhrij Patel, Souradip Chakraborty, Mengdi Wang, Dinesh Manocha, Amrit Singh Bedi 4/2/2026

Code Comprehension then Auditing for Unsupervised LLM Evaluation

Method for unsupervised code correctness evaluation using LLMs through code comprehension before auditing, eliminating need for reference implementations.

Ax Aditi Singh, Abul Ehtesham, Saket Kumar, Tala Talaei Khoei, Athanasios V. Vasilakos 4/2/2026

Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG

Survey of agentic RAG systems combining LLMs with real-time retrieval to address static training data limitations and improve contextual accuracy.

Ax Matthew DosSantos DiSorbo, Harang Ju, Sinan Aral 4/2/2026

Teaching AI to Handle Exceptions: Supervised Fine-Tuning with Human-Aligned Judgment

Research on fine-tuning LLMs as agentic systems to handle exceptions and improve decision-making in complex real-world contexts.

Ax Marco Valentino, Geonhee Kim, Dhairya Dalal, Zhixue Zhao, Andr\'e Freitas 4/2/2026

Mitigating Content Effects on Reasoning in Language Models through Fine-Grained Activation Steering

Study on mitigating reasoning biases in LLMs through activation steering at inference time to improve logical validity discrimination.

Ax Miho Koda, Yu Zheng, Ruixian Ma, Mingyang Sun, Devesh Pansare, Fabio Duarte, Paolo Santi 4/2/2026

LocationReasoner: Evaluating LLMs on Real-World Site Selection Reasoning

Research evaluating LLM reasoning capabilities on real-world site selection tasks, testing if models like o1 and DeepSeek-R1 generalize beyond math/code domains.

Ax Junxing Hu, Ai Han, Haolan Zhan, Pu Wei, Zhiqian Zhang, Yuhang Guo, Jiawei Lu, Zhen Chen, Haoran Li, Zicheng Zhang 4/2/2026

HiMA-Ecom: Enabling Joint Training of Hierarchical Multi-Agent E-commerce Assistants

Benchmark and framework for training hierarchical multi-agent LLM systems with master-coordinator and specialized sub-agents for e-commerce applications.

Ax Chenyu Zhou, Jingyuan Yang, Linwei Xin, Yitian Chen, Ziyan He, Dongdong Ge 4/2/2026

Auto-Formulating Dynamic Programming Problems with Large Language Models

Approach using LLMs to automate formulation of dynamic programming models for operations research, addressing stochastic transitions and data scarcity.

Ax Ammar Ahmed, Azal Ahmad Khan, Ayaan Ahmad, Sheng Di, Zirui Liu, Ali Anwar 4/2/2026

Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts

Retrieval-of-Thought method that reuses reasoning steps across problems via thought graphs to improve inference efficiency and reduce latency/cost.

Ax Boxuan Zhang, Yi Yu, Jiaxuan Guo, Jing Shao 4/2/2026

Dive into the Agent Matrix: A Realistic Evaluation of Self-Replication Risk in LLM Agents

Research on self-replication risks in LLM agents driven by objective misalignment, moving from theoretical concern to practical reality assessment.

Ax Zheng Zhang, Jiarui He, Yuchen Cai, Deheng Ye, Peilin Zhao, Ruili Feng, Hao Wang 4/2/2026

Genesis: Evolving Attack Strategies for LLM Web Agent Red-Teaming

Genesis: framework evolving attack strategies for red-teaming LLM web agents using behavioral pattern learning.

Ax Xiao Yang, Xuejiao Zhao, Zhiqi Shen 4/2/2026

EHRStruct: A Comprehensive Benchmark Framework for Evaluating Large Language Models on Structured Electronic Health Record Tasks

EHRStruct: benchmark framework evaluating LLM performance on structured electronic health record tasks with standardized metrics.

Ax Xiaohan Zhang, Tian Gao, Mingyue Cheng, Bokai Pan, Ze Guo, Yaguo Liu, Xiaoyu Tao, Qi Liu 4/2/2026

Alphacast: An Interaction-Driven Agentic Reasoning Framework for Cognition-Inspired Time Series Forecasting

Alphacast: agentic reasoning framework for time series forecasting using iterative multi-step reasoning with domain knowledge integration.

Ax Guanzhi Deng, Bo Li, Ronghao Chen, Xiujin Liu, Zhuo Han, Huacan Wang, Lijie Wen, Linqi Song 4/2/2026

DR-LoRA: Dynamic Rank LoRA for Fine-Tuning Mixture-of-Experts Models

DR-LoRA: parameter-efficient fine-tuning method for MoE LLMs using dynamic rank allocation based on expert specialization.

Ax Shuliang Liu, Xingyu Li, Hongyi Liu, Dong Fang, Yibo Yan, Bingchen Duan, Qi Zheng, Lingfeng Su, Xuming Hu 4/2/2026

Distilling the Thought, Watermarking the Answer: A Principle Semantic Guided Watermark for Large Reasoning Models

ReasonMa: semantic-guided watermarking technique for reasoning LLMs that preserves logical coherence while protecting models.

Ax David Hud\'ak, Maris F. L. Galesloot, Martin Tappler, Martin Kure\v{c}ka, Nils Jansen, Milan \v{C}e\v{s}ka 4/2/2026

Finite-State Controllers for (Hidden-Model) POMDPs using Deep Reinforcement Learning

arXiv paper on Lexpop framework using deep RL to train finite-state controllers for solving POMDPs robustly.

Ax Bj\"orn Hoppmann, Christoph Scholz 4/2/2026

Meta-Learning and Meta-Reinforcement Learning -- Tracing the Path towards DeepMind's Adaptive Agent

Survey on meta-learning and meta-reinforcement learning enabling rapid adaptation to novel tasks with minimal data.

Ax Jonas Karge 4/2/2026

Epistemic Filtering and Collective Hallucination: A Jury Theorem for Confidence-Calibrated Agents

arXiv paper on heterogeneous agent collective accuracy using calibration and selective abstention in voting systems.

Ax Julia Jose, Ritik Roongta, Rachel Greenstadt 4/2/2026

When Agents Persuade: Rhetoric Generation and Mitigation in LLMs

Analysis of LLM-based agents' capability to generate propaganda and rhetorical manipulation, with detection of techniques like loaded language and appeals to fear.

Ax Vasily Ilin 4/2/2026

Semi-Autonomous Formalization of the Vlasov-Maxwell-Landau Equilibrium

AI-assisted formalization of Vlasov-Maxwell-Landau system equilibrium in Lean 4 using DeepThink reasoning and Claude Code agent for automated theorem proving.

Ax Yi Nian, Haosen Cao, Shenzhe Zhu, Henry Peng Zou, Qingqing Luan, Yue Zhao 4/2/2026

When Only the Final Text Survives: Implicit Execution Tracing for Multi-Agent Attribution

Attribution method for multi-agent systems that identifies responsible agents without execution logs by analyzing final text only, addressing privacy-constrained scenarios.

Ax Chung-En Johnny Yu, Brian Jalaian, Nathaniel D. Bastian 4/2/2026

SCoOP: Semantic Consistent Opinion Pooling for Uncertainty Quantification in Multiple Vision-Language Model Systems

Training-free uncertainty quantification framework for combining multiple vision-language models through semantic-consistent opinion pooling to reduce hallucinations.

Ax Zehua Han, Jing Xiao, Yiqi Duan, Mengyu Xiang, Yuheng Ji, Xiaolong Zheng, Chenghanyu Zhang, Zhendong She, Junyu Shen, Dingwei Tan, Shichu Sun, Zhou Cong, Mingxuan Liu, Fengxiang Wang, Jinping Sun, Yangang Sun 4/2/2026

PReD: An LLM-based Foundation Multimodal Model for Electromagnetic Perception, Recognition, and Decision

Foundation multimodal model for electromagnetic domain covering perception, recognition, and decision-making using LLM capabilities adapted for domain-specific applications.

Ax Lvmin Zhang, Maneesh Agrawala 4/2/2026

View-oriented Conversation Compiler for Agent Trace Analysis

Compiler for analyzing and visualizing structured agent traces including nested tool calls, reasoning blocks, and sub-agent invocations for better agentic system understanding.

Ax Davide Di Gioia 4/2/2026

Cognitive Friction: A Decision-Theoretic Framework for Bounded Deliberation in Tool-Using Agents

Decision-theoretic framework (Triadic Cognitive Architecture) for tool-using agents that bounds information-acquisition costs and tool usage to prevent systematic failures.

Ax Manuel Serra Nunes, Atabak Dehban, Yiannis Demiris, Jos\'e Santos-Victor 4/2/2026

Ego-Foresight: Self-supervised Learning of Agent-Aware Representations for Improved RL

Self-supervised learning method for RL agents that models agent and environment separately to improve sample efficiency without requiring supervisory signals.

Ax Benoit Coqueret, Mathieu Carbone, Olivier Sentieys, Gabriel Zaid 4/2/2026

A Divide-and-Conquer Strategy for Hard-Label Extraction of Deep Neural Networks via Side-Channel Attacks

Demonstrates hard-label extraction of deep neural networks via side-channel attacks using divide-and-conquer strategy for DNN intellectual property theft.

Ax Luigi Celona, Simone Bianco, Paolo Napoletano 4/2/2026

Cross-Camera Distracted Driver Classification through Feature Disentanglement and Contrastive Learning

Addresses accuracy loss in distracted driver classification across camera conditions using feature disentanglement and contrastive learning for robustness.

Ax Johnny Chan, Yuming Li 4/2/2026

Enhancing Team Diversity with Generative AI: A Novel Project Management Framework

Project management framework using generative AI agents to address team composition gaps by matching sociologically identified personality patterns and roles.

Ax Na Min An, Eunki Kim, Wan Ju Kang, Sangryul Kim, James Thorne, Hyunjung Shim 4/2/2026

How Blind and Low-Vision Individuals Prefer Large Vision-Language Model-Generated Scene Descriptions

User study with blind and low-vision participants evaluating preferences for LVLM-generated scene descriptions, examining effectiveness and user preferences.

Ax Jialuo Li, Wenhao Chai, Xingyu Fu, Haiyang Xu, Saining Xie 4/2/2026

Science-T2I: Addressing Scientific Illusions in Image Synthesis

ScienceT2I dataset and benchmark evaluating scientific correctness in image synthesis, addressing gap between visual fidelity and physical realism across 16 scientific domains.

Ax Carlos Rodriguez-Pardo, Leonardo Chiani, Emanuele Borgonovo, Massimo Tavoni 4/2/2026

Neural Conditional Transport Maps

Neural framework for learning conditional optimal transport maps with hypernetworks that generate adaptive transport parameters for categorical and continuous variables.

Ax Leon Eshuijs, Archie Chaudhury, Alan McBeth, Ethan Nguyen 4/2/2026

But what is your honest answer? Aiding LLM-judges with honest alternatives using steering vectors

JUSSA framework uses steering vectors to improve LLM-as-judge reliability by detecting and mitigating subtle dishonesty like sycophancy through contrastive alternatives.

Ax Alejandro Murillo-Gonzalez, Lantao Liu 4/2/2026

Situationally-Aware Dynamics Learning

Framework for online learning of hidden state representations in autonomous robots to handle unobserved factors in complex, unstructured environments.

Ax Chunyang Jiang, Chi-min Chan, Yiyang Cai, Yulong Liu, Wei Xue, Yike Guo 4/2/2026

Graceful Forgetting in Generative Language Models

Proposes graceful forgetting methods to mitigate negative transfer by selectively forgetting detrimental pre-training knowledge during fine-tuning of language models.

Ax Shimao Zhang, Zhejian Lai, Xiang Liu, Shuaijie She, Xiao Liu, Yeyun Gong, Shujian Huang, Jiajun Chen 4/2/2026

How Does Alignment Enhance LLMs' Multilingual Capabilities? A Language Neurons Perspective

Analyzes language-specific neurons to understand how multilingual alignment transfers capabilities from high-resource to low-resource languages in LLMs.

Ax Ananthu Aniraj, Cassio F. Dantas, Dino Ienco, Diego Marcos 4/2/2026

Two-stage Vision Transformers and Hard Masking offer Robust Object Representations

Two-stage vision transformer with hard masking approach for robust object representations that balance context dependence with distribution shift robustness.

Ax Kellie Yu Hui Sim, Roy Ka-Wei Lee, Kenny Tsu Wei Choo 4/2/2026

"Is This Really a Human Peer Supporter?": Misalignments Between Peer Supporters and Experts in LLM-Supported Interactions

Investigates misalignments between LLM-supported peer supporters and mental health experts, examining quality and safety concerns in AI-driven psychosocial support.

Ax Hexiang Gu, Qifan Yu, Yuan Liu, Zikang Li, Saihui Hou, Jian Zhao, Zhaofeng He 4/2/2026