Isolater - Feed

Ax Samya Acharja, Kanchan Chowdhury 3/25/2026

Natural Language Interfaces for Spatial and Temporal Databases: A Comprehensive Overview of Methods, Taxonomy, and Future Directions

Surveys natural language interfaces to spatial and temporal databases, covering methods and taxonomy for NLIDBs with geospatial data.

Ax Michal Balcerak, Suprosanna Shit, Chinmay Prabhakar, Sebastian Kaltenbach, Michael S. Albergo, Yilun Du, Bjoern Menze 3/25/2026

Graph Energy Matching: Transport-Aligned Energy-Based Modeling for Graph Generation

Proposes energy-based modeling for discrete graph generation using transport-aligned sampling to improve efficiency and quality.

Ax Zixiang Jiang, Yulun Zhang, Rishi Veerapaneni, Jiaoyang Li 3/25/2026

Planning over MAPF Agent Dependencies via Multi-Dependency PIBT

Extends the Priority Inheritance with Backtracking (PIBT) algorithm for multi-agent path finding with multiple dependencies in congested environments.

Ax Yiqi Zhang, Huiqiang Jiang, Xufang Luo, Zhihe Yang, Chengruidong Zhang, Yifei Shen, Dongsheng Li, Yuqing Yang, Lili Qiu, Yang You 3/25/2026

SortedRL: Accelerating RL Training for LLMs through Online Length-Aware Scheduling

Proposes SortedRL, a length-aware scheduling method to accelerate RL training for LLMs, reducing rollout bottleneck in long chain-of-thought generation.

Ax Teerthaa Parakh, Karen M. Feigh 3/25/2026

Biased Error Attribution in Multi-Agent Human-AI Systems Under Delayed Feedback

Examines how humans attribute errors in multi-agent AI systems under delayed feedback, revealing biases in decision-making across sequential steps.

Ax Islam Debicha, Tayeb Kenaza, Ishak Charfi, Salah Mosbah, Mehdi Sehaki, Jean-Michel Dricot 3/25/2026

Targeted Adversarial Traffic Generation: Black-box Approach to Evade Intrusion Detection Systems in IoT Networks

Studies practical adversarial attack feasibility against ML-based IoT intrusion detection systems, addressing implementation constraints.

Ax Sabaat Haroon, Mohammad Taha Khan, Muhammad Ali Gulzar 3/25/2026

Evaluating LLM-Based Test Generation Under Software Evolution

Evaluates whether LLM-generated tests reflect genuine program understanding or superficial pattern reproduction, examining behavior under software evolution.

Ax Yiping Chen, Jinpeng Li, Wenyu Ke, Yang Luo, Jie Ouyang, Zhongjie He, Li Liu, Hongchao Fan, Hao Wu 3/25/2026

3DCity-LLM: Empowering Multi-modality Large Language Models for 3D City-scale Perception and Understanding

Proposes 3DCity-LLM, a multimodal LLM framework for 3D city-scale perception using coarse-to-fine feature encoding across object, relational, and global contexts.

Ax Yuntong Zhang, Zhiyuan Pan, Imam Nur Bani Yusuf, Haifeng Ruan, Ridwan Shariffdeen, Abhik Roychoudhury 3/25/2026

Code Review Agent Benchmark

Introduces a benchmark dataset and evaluation framework for code review agents, addressing code quality assurance as AI-generated code scales.

Ax Duc Vu, Kien Nguyen, Trong-Tung Nguyen, Ngan Nguyen, Phong Nguyen, Khoi Nguyen, Cuong Pham, Anh Tran 3/25/2026

InverFill: One-Step Inversion for Enhanced Few-Step Diffusion Inpainting

Improves diffusion-based image inpainting through one-step inversion to reduce artifacts and sampling steps.

Ax Haoran Yuan, Weigang Yi, Zhenyu Zhang, Wendi Chen, Yuchen Mo, Jiashi Yin, Xinzhuo Li, Xiangyu Zeng, Chuan Wen, Cewu Lu, Katherine Driggs-Campbell, Ismini Lourentzou 3/25/2026

VTAM: Video-Tactile-Action Models for Complex Physical Interaction Beyond VLAs

Proposes VTAM, extending video-action models for embodied AI with tactile sensing for contact-rich physical interactions beyond vision-only approaches.

Ax Muhammad Khalid, Manuel Oriol, Yilmaz Uygun 3/25/2026

ReqFusion: A Multi-Provider Framework for Automated PEGS Analysis Across Software Domains

ReqFusion integrates multiple LLM providers (GPT, Claude, Groq) to automate software requirements extraction, classification, and analysis.

Ax Sagar Kumar, Ariel Flint, Luca Maria Aiello, Andrea Baronchelli 3/25/2026

Failure of contextual invariance in gender inference with large language models

Shows LLMs produce unstable outputs on gender inference tasks under minimal context variations, revealing dependence on cultural stereotypes in training data.

Ax Adrian Bulat, Alberto Baldrati, Ioannis Maniadis Metaxas, Yassine Ouali, Georgios Tzimiropoulos 3/25/2026

VISion On Request: Enhanced VLLM efficiency with sparse, dynamically selected, vision-language interactions

Proposes VISOR, a method to reduce inference costs in large vision-language models through dynamic, sparse vision-language interactions without information bottlenecks.

Ax Ufaq Khan, Umair Nawaz, L D M S S Teja, Numaan Saeed, Muhammad Bilal, Yutong Xie, Mohammad Yaqub, Muhammad Haris Khan 3/25/2026

MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage

Evaluates Vision Language Models' ability to perform pre-diagnostic sanity checks in medical imaging, identifying gaps between fluent text generation and safe visual understanding.

Ax Saleem Ahmed, Srirangaraj Setlur, Venu Govindaraju 3/25/2026

RealCQA-V2: A Diagnostic Benchmark for Structured Visual Entailment over Scientific Charts

RealCQA-V2 benchmark for evaluating multimodal reasoning on scientific chart understanding with visual entailment verification.

Ax Cecil Pang 3/25/2026

Toward Data Systems That Are Business Semantic Centric and AI Agents Assisted

BSDS system architecture integrating AI agents with data platforms for business-semantic-centric decision-making and workflows.

Ax Dadi Guo, Tianyi Zhou, Dongrui Liu, Chen Qian, Qihan Ren, Shuai Shao, Zhiyuan Fan, Yi R. Fung, Kun Wang, Linfeng Zhang, Jing Shao 3/25/2026

Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration under Validate-by-Reproduce Paradigm

TRACE framework for self-evolving agent benchmarks that dynamically increase difficulty using test-time exploration and validation.

Ax Nan Huo, Xiaohan Xu, Jinyang Li, Per Jacobsson, Shipei Lin, Bowen Qin, Binyuan Hui, Xiaolong Li, Ge Qu, Shuzheng Si, Linheng Han, Edward Alexander, Xintong Zhu, Rui Qin, Ruihan Yu, Yiyao Jin, Feige Zhou, Weihao Zhong, Yun Chen, Hongyu Liu, Chenhao Ma, Fatma Ozcan, Yannis Papakonstantinou, Reynold Cheng 3/25/2026

BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynamic Interactions

BIRD-INTERACT benchmark evaluating LLMs on multi-turn text-to-SQL tasks with dynamic interactions and error handling.

Ax Raj Ghugare, Roger Creus Castanyer, Catherine Ji, Kathryn Wantlin, Jin Schofield, Karthik Narasimhan, Benjamin Eysenbach 3/25/2026

BuilderBench: The Building Blocks of Intelligent Agents

BuilderBench benchmark for evaluating AI agents' ability to learn through exploration and interaction beyond training data patterns.

Ax Vít Růžička, Gonzalo Mateo-García, Itziar Irakulis-Loitxate, Juan Emmanuel Johnson, Manuel Montesino San Martín, Anna Allen, Alma Raunak, Carol Castaneda, Luis Guanter, David R. Thompson 3/25/2026

Operational machine learning for remote spectroscopic detection of CH₄ point sources

ML system for detecting methane emissions from satellite spectroscopy data, addressing false detections in environmental monitoring.

Ax Yue Zhong, Yongju Tong, Jiawen Kang, Minghui Dai, Hong-Ning Dai, Zhou Su, Dusit Niyato 3/25/2026

Hybrid Stackelberg Game and Diffusion-based Auction for Two-tier Agentic AI Task Offloading in Internet of Agents

Hybrid Stackelberg game and diffusion-based auction mechanism for task offloading among collaborative AI agents in Internet of Agents.

Ax Abhishek Kumar, Riya Tapwal, Carsten Maple 3/25/2026

DriveSafe: A Hierarchical Risk Taxonomy for Safety-Critical LLM-Based Driving Assistants

Domain-specific risk taxonomy and evaluation framework for LLM-based driving assistants addressing safety-critical scenarios.

Ax Zeping Li, Hongru Wang, Yiwen Zhao, Guanhua Chen, Yixia Li, Keyang Chen, Yixin Cao, Guangnan Ye, Hongfeng Chai, Zhenfei Yin 3/25/2026

Rethinking the Role of Entropy in Optimizing Tool-Use Behaviors for Large Language Model Agents

Entropy-based analysis shows that lowering entropy improves tool-use behavior in LLM agents, cutting excessive tool calls and latency.

Ax Xun Huang, Simeng Qin, Xiaoshuang Jia, Ranjie Duan, Huanqian Yan, Zhitao Zeng, Fei Yang, Yang Liu, Xiaojun Jia 3/25/2026

Obscure but Effective: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search

Classical Chinese jailbreak prompts bypass LLM safety constraints more effectively than English prompts due to their obscurity and conciseness.

Ax Hyungyung Lee, Hangyul Yoon, Edward Choi 3/25/2026

CXReasonAgent: Evidence-Grounded Diagnostic Reasoning Agent for Chest X-rays

Evidence-grounded diagnostic reasoning agent using vision-language models for chest X-ray interpretation.

Ax Sivaram Pothireddypalli, Ashish Raman, Deepak Narayan Gadde, Aman Kumar 3/25/2026

Agentic AI-based Coverage Closure for Formal Verification

LLM-enabled agentic workflow automating coverage analysis and gap identification for IC formal verification.

Ax Kenny Ye Liang, Zhongyi Pei, Huan Zhang, Yuhui Liu, Shaoxu Song, Jianmin Wang 3/25/2026

Retrieval-Augmented Generation with Covariate Time Series

Extends RAG paradigm to time-series foundation models for predictive maintenance with covariate dynamics.

Ax Giacomo Rosa, Jean Honorio, Nir Lipovetzky, Sebastian Sardina 3/25/2026

Planning as Goal Recognition: Deriving Heuristics from Intention Models -- Extended Version

Adopts goal recognition heuristics for classical planning problems to improve plan search prioritization.

Ax Davide Di Gioia 3/25/2026

Cascade-Aware Multi-Agent Routing: Spatio-Temporal Sidecars and Geometry-Switching

Multi-agent routing system aware of cascading failures in tree versus cyclic graph topologies with geometry-switching.

Ax Ruixiang Liu, Zhenlong Li, Ali Khosravi Kazazi 3/25/2026

Towards Intelligent Geospatial Data Discovery: a knowledge graph-driven multi-agent framework powered by large language models

Knowledge graph-driven multi-agent LLM framework for semantic geospatial data discovery with improved retrieval.

Ax Sheng Liu, Long Chen, Zeyun Zhao, Qinglin Gou, Qingyue Wei, Arjun Masurkar, Kevin M. Spiegler, Philip Kuball, Stefania C. Bray, Megan Bernath, Deanna R. Willis, Jiang Bian, Lei Xing, Eric Topol, Kyunghyun Cho, Yu Huang, Ruogu Fang, Narges Razavian, James Zou 3/25/2026

Cerebra: A Multidisciplinary AI Board for Multimodal Dementia Characterization and Risk Assessment

Cerebra: multi-agent AI system with specialized agents for EHR, clinical notes, and multimodal data in dementia assessment.

Ax Jonas Oppenlaender, Joonas Hämäläinen 3/25/2026

Mapping the Challenges of HCI: An Application and Evaluation of ChatGPT for Mining Insights at Scale

Evaluates the effectiveness of ChatGPT (GPT-3.5/4) at extracting research challenges from HCI literature at scale using a two-step approach.

Ax Yunni Qu (Department of Computer Science, University of North Carolina at Chapel Hill), Bhargav Vaduri (Department of Computer Science, University of North Carolina at Chapel Hill), Karthikeya Jatoth (Department of Computer Science, University of North Carolina at Chapel Hill), James Wellnitz (Eshelman School of Pharmacy, University of North Carolina at Chapel Hill), Dzung Dinh (Department of Computer Science, University of North Carolina at Chapel Hill), Seth Veenbaas (Eshelman School of Pharmacy, University of North Carolina at Chapel Hill), Jonathan Chapman (Eshelman School of Pharmacy, University of North Carolina at Chapel Hill), Alexander Tropsha (Eshelman School of Pharmacy, University of North Carolina at Chapel Hill), Junier Oliva (Department of Computer Science, University of North Carolina at Chapel Hill) 3/25/2026