Isolater - Feed

Ax Md Tanvirul Alam, Dipkamal Bhusal, Salman Ahmad, Nidhi Rastogi, Peter Worth 2/17/2026

AthenaBench: A Dynamic Benchmark for Evaluating LLMs in Cyber Threat Intelligence

ArXiv paper introducing AthenaBench, a benchmark for evaluating LLMs on cyber threat intelligence tasks including analysis of unstructured security reports.

Ax Jiahui Gao, Kuang Zhou, Yuchen Zhu, Keyu Wu 2/17/2026

Importance Ranking in Complex Networks via Influence-aware Causal Node Embedding

Influence-aware causal node embedding method for quantifying node importance in complex networks, applicable to influence maximization and network analysis.

Ax Tung-Long Vuong, Julien Monteil, Hien Dang, Volodymyr Vaskovych, Trung Le, Vu Nguyen 2/17/2026

On the Mechanisms of Collaborative Learning in VAE Recommenders

Theoretical analysis of collaborative learning in VAE-based recommender systems, showing latent proximity governs how binary masking improves performance.

Ax Mohammad Afzal, S. Akshay, Blaise Genest, Ashutosh Gupta 2/17/2026

Formal Reasoning About Confidence and Automated Verification of Neural Networks

Framework for formally reasoning about confidence and robustness in neural networks, generalizing existing adversarial robustness verification approaches.

Ax Chanakya Ekbote, Vijay Lingam, Sujay Sanghavi, Jun Huan, Behrooz Omidvar-Tehrani, Anoop Deoras, Stefano Soatto 2/17/2026

MURPHY: Multi-Turn GRPO for Self Correcting Code Generation

MURPHY: Multi-turn reinforcement learning framework for self-correcting code generation combining group relative policy optimization with execution verification.

Ax Chi-Yu Chen, Rawan Abulibdeh, Arash Asgari, Sebastián Andrés Cajas Ordóñez, Leo Anthony Celi, Deirdre Goode, Hassan Hamidi, Laleh Seyyed-Kalantari, Ned McCague, Thomas Sounack, Po-Chih Kuo 2/17/2026

Algorithms Trained on Normal Chest X-rays Can Predict Health Insurance Types

Study showing state-of-the-art vision models trained on normal chest X-rays can predict patient health insurance type, revealing encoded socioeconomic bias.

Ax Bidipta Sarkar, Mattie Fellows, Juan Agustin Duque, Alistair Letcher, Antonio León Villares, Anya Sims, Clarisse Wibault, Dmitry Samsonov, Dylan Cope, Jarek Liesen, Kang Li, Lukas Seier, Theo Wolf, Uljad Berdica, Valentin Mohl, Alexander David Goldie, Aaron Courville, Karin Sevegnani, Shimon Whiteson, Jakob Nicolaus Foerster 2/17/2026

Evolution Strategies at the Hyperscale

EGGROLL: Scalable evolution strategies algorithm using low-rank approximations to improve training efficiency of black-box optimization on GPUs.

Ax Yuepeng Sheng, Yuwei Huang, Shuman Liu, Anxiang Zeng, Haibo Zhang 2/17/2026

ESPO: Entropy Importance Sampling Policy Optimization

ESPO: Entropy importance sampling policy optimization for stable and efficient token-level RL training of LLMs on complex reasoning tasks at scale.

Ax Cong Wang, Changfeng Gao, Yang Xiang, Zhihao Du, Keyu An, Han Zhao, Qian Chen, Xiangang Li, Yingming Gao, Ya Li 2/17/2026

RRPO: Robust Reward Policy Optimization for LLM-based Emotional TTS

RRPO: Robust reward policy optimization framework preventing reward hacking in LLM-based emotional text-to-speech by addressing vulnerability of vanilla reward models.

Ax Daeyong Kwon, SeungHeon Doh, Juhan Nam 2/17/2026

ArtistMus: A Globally Diverse, Artist-Centric Benchmark for Retrieval-Augmented Music Question Answering

ArtistMus: Benchmark dataset for retrieval-augmented music question answering grounded in artist metadata to evaluate LLMs on music-related reasoning tasks.

Ax Hao Chen, Rui Yin, Yifan Chen, Qi Chen, Chao Li 2/17/2026

Learning Patient-Specific Disease Dynamics with Latent Flow Matching for Longitudinal Imaging Generation

Latent flow matching method for modeling continuous disease progression from longitudinal medical imaging to enable early diagnosis and personalized treatment planning.

Ax Jeongjun Park, Sunwook Hwang, Hyeonho Noh, Jin Mo Yang, Hyun Jong Yang, Saewoong Bahk 2/17/2026

ALERT Open Dataset and Input-Size-Agnostic Vision Transformer for Driver Activity Recognition using IR-UWB

ALERT dataset and input-size-agnostic Vision Transformer for driver activity recognition using IR-UWB radar to detect distracted driving behaviors.

Ax Kaustav Chatterjee, Joshua Li, Kundan Parajulee, Jared Schwennesen 2/17/2026

Network Level Evaluation of Hangup Susceptibility of HRGCs using Deep Learning and Sensing Techniques: A Goal Towards Safer Future

Deep learning framework for network-level evaluation of vehicle hang-up susceptibility at highway-railway grade crossings using laser imaging and sensor data.

Ax Vivan Doshi, Mengyuan Li 2/17/2026

Writing in Symbiosis: Mapping Human Creative Agency in the AI Era

Qualitative study examining patterns of human-AI coevolution in creative writing and how human agency adapts alongside machine capabilities.

Ax Yunhao Yao, Zhiqiang Wang, Haoran Cheng, Yihang Cheng, Haohua Du, Xiang-Yang Li 2/17/2026

IntentMiner: Intent Inversion Attack via Tool Call Analysis in the Model Context Protocol

IntentMiner: Privacy attack exploiting Model Context Protocol servers to extract user intents from LLM tool calls, revealing new security vulnerabilities in agentic AI systems.

Ax Jingli Liu, Huannan Zheng, Bohao Zou, Kezhou Yang 2/17/2026

Emergent human-like working memory from artificial neurons with intrinsic plasticity

IPNet: Neuromorphic architecture using magnetic tunnel junction intrinsic plasticity to implement human-like working memory with reduced energy costs.

Ax Matthieu Mastio, Paul Saves, Benoit Gaudou, Nicolas Verstaevel 2/17/2026

Adaptive Agents in Spatial Double-Auction Markets: Modeling the Emergence of Industrial Symbiosis

Agent-based model simulating adaptive firm behavior in spatial double-auction markets to understand emergence of industrial symbiosis under socio-spatial constraints.

Ax Nilesh Jain, Hyungil Suh, Seyi Adeyinka, Leor Roseman, Aza Allsop 2/17/2026

Multi-LLM Thematic Analysis with Dual Reliability Metrics: Combining Cohen's Kappa and Semantic Similarity for Qualitative Research Validation

Multi-LLM validation framework for thematic analysis combining Cohen's Kappa and semantic similarity metrics to improve reliability of LLM-based qualitative research coding.

Ax Rui Li, Zhaoning Zhang, Libo Zhang, Huaimin Wang, Xiang Fu, Zhiquan Lai 2/17/2026

Nightjar: Dynamic Adaptive Speculative Decoding for Large Language Models Serving

Nightjar: Dynamic adaptive speculative decoding method that adjusts verification overhead based on request load to optimize LLM inference throughput and latency.

Ax Meili Sun, Chunjiang Zhao, Lichao Yang, Hao Liu, Shimin Hu, Ya Xiong 2/17/2026

Vision-Based Early Fault Diagnosis and Self-Recovery for Strawberry Harvesting Robots

Visual fault diagnosis framework for strawberry harvesting robots using multi-task learning to address gripper misalignment and grasping failures.

Ax Almaz Ermilov 2/17/2026

FormationEval, an open multiple-choice benchmark for petroleum geoscience

FormationEval: 505-question multiple-choice benchmark for evaluating LLMs on petroleum geoscience topics like petrophysics and reservoir engineering.

Ax David Samuel Setiawan, Raphaël Merx, Jey Han Lau 2/17/2026

Context Volume Drives Performance: Tackling Domain Shift in Extremely Low-Resource Translation via RAG

Retrieval-augmented generation approach addressing domain shift in low-resource neural machine translation using context volume from limited corpora.

Ax Wang Zixian 2/17/2026

Orthogonalized Policy Optimization: Decoupling Sampling Geometry from Optimization Geometry in RLHF

Unified framework for LLM alignment decoupling sampling and optimization geometry across PPO, DPO, IPO algorithms and variants.

Ax Asif Mohammed Samir, Mohammad Masudur Rahman 2/17/2026

Improved Bug Localization with AI Agents Leveraging Hypothesis and Dynamic Cognition

LLM-based AI agents with hypothesis-driven cognition for improved software bug localization by analyzing code component relationships.

Ax Víctor Yeste, Paolo Rosso 2/17/2026

Human Values in a Single Sentence: Moral Presence, Hierarchies, and Transformer Ensembles on the Schwartz Continuum

Multi-label classification of Schwartz human values in single sentences using transformer ensembles on political and news text corpora.

Ax Kevin Tseng, Juan Carlos Toledano, Bart De Clerck, Yuliia Dukach, Phil Tinn 2/17/2026

An Agentic Operationalization of DISARM for FIMI Investigation on Social Media

Agentic operationalization of DISARM framework for investigating foreign information manipulation on social media across NATO allied partners.

Ax Fabi Nahian Madhurja, Rusab Sarmun, Muhammad E. H. Chowdhury, Adam Mushtak, Israa Al-Hashimi, Sohaib Bassam Zoghoul 2/17/2026

Tracing 3D Anatomy in 2D Strokes: A Multi-Stage Projection Driven Approach to Cervical Spine Fracture Identification

Multi-stage approach using 2D projections for automated cervical spine fracture detection in 3D CT volumes with vertebra-level analysis.

Ax Andy Zhu, Rongzhe Wei, Yupu Gu, Pan Li 2/17/2026

GRIP: Algorithm-Agnostic Machine Unlearning for Mixture-of-Experts via Geometric Router Constraints

Machine unlearning method for Mixture-of-Experts LLMs using geometric router constraints to erase knowledge rather than redirect queries.

Ax Hansheng Ren 2/17/2026

From Fuzzy to Exact: The Halo Architecture for Infinite-Depth Reasoning via Rational Arithmetic

Alternative LLM architecture using rational arithmetic instead of floating-point to enable infinite-depth reasoning without structural heuristics.

Ax Ling Tang, Jilin Mei, Dongrui Liu, Chen Qian, Dawei Cheng, Jing Shao, Xia Hu 2/17/2026

Interpreting Emergent Extreme Events in Multi-Agent Systems

Framework for interpreting and explaining emergent extreme events in LLM-powered multi-agent systems to improve safety and transparency.

Ax Jonas Hübotter, Frederike Lübeck, Lejs Behric, Anton Baumann, Marco Bagatella, Daniel Marta, Ido Hakimi, Idan Shenfeld, Thomas Kleine Buening, Carlos Guestrin, Andreas Krause 2/17/2026

Reinforcement Learning via Self-Distillation

Self-distillation approach for reinforcement learning leveraging rich textual feedback from verifiable environments to improve credit assignment in code/math tasks.

Ax Francisco Caldas, Sahil Kumar, Cláudia Soares 2/17/2026

A Decomposable Forward Process in Diffusion Models for Time-Series Forecasting

Model-agnostic diffusion process decomposing time-series signals into spectral components to preserve temporal patterns like seasonality.

Ax Tao Yu, Haopeng Jin, Hao Wang, Shenghua Chai, Yujia Yang, Junhao Gong, Jiaming Guo, Minghui Zhang, Xinlong Chen, Zhenghao Zhang, Yuxuan Zhou, Yufei Xiong, Shanbin Zhang, Jiabing Yang, Hongzhu Yi, Xinming Wang, Cheng Zhong, Xiao Ma, Zhang Zhang, Yan Huang, Liang Wang 2/17/2026

ShotFinder: Imagination-Driven Open-Domain Video Shot Retrieval via Web Search

Benchmark for open-domain video shot retrieval using LLMs for understanding editing requirements and retrieving keyframe-oriented shots.

Ax Louis Schiekiera, Max Zimmer, Christophe Roux, Sebastian Pokutta, Fritz Günther 2/17/2026

From Associations to Activations: Comparing Behavioral and Hidden-State Semantic Geometry in LLMs

Analyzing semantic geometry in LLM hidden states versus behavioral similarity through psycholinguistic experiments across eight instruction-tuned models.

Ax Kai Yuan, Anthony Zheng, Jia Hu, Divyanshu Sheth, Hemanth Velaga, Kylee Kim, Matteo Guarrera, Besim Avci, Jianhua Li, Xuetao Yin, Rajyashree Mukherjee, Sean Suchter 2/17/2026

Unifying Ranking and Generation in Query Auto-Completion via Retrieval-Augmented Generation and Multi-Objective Alignment

Unified retrieval-augmented generation framework for query auto-completion combining ranking and generation to reduce hallucination and improve coverage.

Ax Jiaming Cui, Wenqiang Li, Shuai Zhou, Ruifeng Qin, Feng Shen 2/17/2026

Cross-Modal Purification and Fusion for Small-Object RGB-D Transmission-Line Defect Detection

Cross-modal fusion network for detecting small-scale defects in transmission lines using RGB-D imagery from UAV inspection.

Ax Abhijit Gupta 2/17/2026

Cardinality-Preserving Attention Channels for Graph Transformers in Molecular Property Prediction

Graph transformer with cardinality-preserving attention for molecular property prediction in drug discovery with limited labeled data.

Ax Asif Tauhid, Sidahmed Benabderrahmane, Mohamad Altrabulsi, Ahamed Foisal, Talal Rahwan 2/17/2026

RPG-AE: Neuro-Symbolic Graph Autoencoders with Rare Pattern Mining for Provenance-Based Anomaly Detection

Graph autoencoder framework combining neuro-symbolic approaches with rare pattern mining for detecting APT cyberattacks in system provenance data.

Ax Jing-Cheng Pang, Liang Lu, Xian Tang, Kun Jiang, Sijie Wu, Kai Zhang, Xubin Li 2/17/2026

Reinforcement Learning with Promising Tokens for Large Language Models

Reinforcement learning technique filtering irrelevant tokens to improve LLM policy optimization by focusing on contextually relevant action spaces.

Ax David P. Woodruff, Vincent Cohen-Addad, Lalit Jain, Jieming Mao, Song Zuo, MohammadHossein Bateni, Simina Branzei, Michael P. Brenner, Lin Chen, Ying Feng, Lance Fortnow, Gang Fu, Ziyi Guan, Zahra Hadizadeh, Mohammad T. Hajiaghayi, Mahdi JafariRaviz, Adel Javanmard, Karthik C. S., Ken-ichi Kawarabayashi, Ravi Kumar, Silvio Lattanzi, Euiwoong Lee, Yi Li, Ioannis Panageas, Dimitris Paparas, Benjamin Przybocki, Bernardo Subercaseaux, Ola Svensson, Shayan Taherijam, Xuan Wu, Eylon Yogev, Morteza Zadimoghaddam, Samson Zhou, Yossi Matias, James Manyika, Vahab Mirrokni 2/17/2026

Accelerating Scientific Research with Gemini: Case Studies and Common Techniques

Case studies of Google's Gemini models assisting scientific research including mathematical discovery and routine task automation.

Ax Emiliano Penaloza, Dheeraj Vattikonda, Nicolas Gontier, Alexandre Lacoste, Laurent Charlin, Massimo Caccia 2/17/2026

Privileged Information Distillation for Language Models

Studying knowledge distillation from privileged information in language models for multi-turn agentic environments, addressing inference-time capability transfer.

Ax Merlin de la Haye, Pascal Lenzner, Farehe Soheil, Marcus Wunderlich 2/17/2026

Metric Hedonic Games on the Line

Game-theoretic analysis of coalition formation in hedonic games using metric spaces.

Ax Sohan Venkatesh, Ashish Mahendran Kurapath 2/17/2026

On the Non-Identifiability of Steering Vectors in Large Language Models

Demonstrates that steering vectors in LLMs are fundamentally non-identifiable due to large equivalence classes of behaviorally identical vectors.

Ax Shang Liu, Hanyu Pei, Zeyan Liu 2/17/2026

ShallowJail: Steering Jailbreaks against Large Language Models

Steering-based jailbreak method against aligned LLMs that requires less computation than white-box approaches while maintaining stealth.

Ax Yidong Jiang, Junrong Chen, Eftychia Makri, Jialin Chen, Peiwen Li, Ali Maatouk, Leandros Tassiulas, Eliot Brenner, Bing Xiang, Rex Ying 2/17/2026

Fin-RATE: A Real-world Financial Analytics and Tracking Evaluation Benchmark for LLMs on SEC Filings

Benchmark for evaluating LLM performance on financial analysis and tracking using SEC filings with multi-document synthesis.

Ax Peizhen Li, Longbing Cao, Xiao-Ming Wu, Yang Zhang 2/17/2026

VividFace: Real-Time and Realistic Facial Expression Shadowing for Humanoid Robots

Real-time facial expression imitation system for humanoid robots enabling lifelike affective human-robot interaction.

Ax Babak Rahmani 2/17/2026

Debugging code world models

Analyzes errors and limitations in Code World Models that simulate program execution by predicting runtime state.

Ax Jan Philip Wahle 2/17/2026

Language Modeling and Understanding Through Paraphrase Generation and Detection

Studies language understanding through paraphrase generation and detection capabilities in language models.

Ax Vid Kocijan, Jinu Sunil, Jan Eric Lenssen, Viman Deb, Xinwei Xe, Federico Reyes Gomez, Matthias Fey, Jure Leskovec 2/17/2026

Predictive Query Language: A Domain-Specific Language for Predictive Modeling on Relational Databases

Domain-specific language for predictive modeling on relational databases covering missing values and future predictions.

Ax Bojian Hou, Xiaolong Liu, Xiaoyi Liu, Jiaqi Xu, Yasmine Badr, Mengyue Hang, Sudhanshu Chanpuriya, Junqing Zhou, Yuhang Yang, Han Xu, Qiuling Suo, Laming Chen, Yuxi Hu, Jiasheng Zhang, Huaqing Xiong, Yuzhen Huang, Chao Chen, Yue Dong, Yi Yang, Shuo Chang, Xiaorui Gan, Wenlin Chen, Santanu Kolay, Darren Liu, Jade Nie, Chunzhi Yang, Ellie Wen, Jiyan Yang, Huayu Li 2/17/2026

Kunlun: Establishing Scaling Laws for Massive-Scale Recommendation Systems through Unified Architecture Design

Derives scaling laws for massive-scale recommendation systems through unified architecture design and efficiency improvements.