Isolater - Feed

Ax Yifan Sun, Jingyan Shen, Yibin Wang, Tianyu Chen, Zhendong Wang, Mingyuan Zhou, Huan Zhang 2/17/2026

Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay

Techniques for improving data efficiency in LLM RL fine-tuning using difficulty-targeted online selection and rollout replay.

Ax Yang Liu, Jiaqi Li, Zilong Zheng 2/17/2026

RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling

RuleReasoner method combining RL with domain-aware sampling for robust rule-based reasoning across varying rule formats and complexity.

Ax Boya Xiong, Shuo Wang, Weifeng Ge, Guanhua Chen, Yun Chen 2/17/2026

Enhancing Delta Compression in LLMs via SVD-based Quantization Error Minimization

SVD-based quantization method for compressing delta parameters from LLM fine-tuning with analysis of underlying compression mechanisms.

Ax Yuan Gao, Mattia Piccinini, Yuchen Zhang, Dingrui Wang, Korbinian Moller, Roberto Brusnicki, Baha Zarrouki, Alessio Gambi, Jan Frederik Totz, Kai Storms, Steven Peters, Andrea Stocco, Bassam Alrifaee, Marco Pavone, Johannes Betz 2/17/2026

Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis

Survey of foundation models for autonomous driving focusing on scenario generation and analysis for simulation-based testing.

Ax Xingyue Huang, Mikhail Galkin, Michael M. Bronstein, İsmail İlkan Ceylan 2/17/2026

HYPER: A Foundation Model for Inductive Link Prediction with Knowledge Hypergraphs

HYPER foundation model for inductive link prediction with knowledge hypergraphs, generalizing to novel entities and relation types.

Ax Hanyu Pei, Jing-Xiao Liao, Qibin Zhao, Ting Gao, Shijun Zhang, Xiaoge Zhang, Feng-Lei Fan 2/17/2026

NeuronSeek: On Stability and Expressivity of Task-driven Neurons

NeuronSeek framework using symbolic regression to discover and construct neural networks with optimized task-driven neurons.

Ax Weike Zhao, Chaoyi Wu, Yanjie Fan, Xiaoman Zhang, Pengcheng Qiu, Yuze Sun, Xiao Zhou, Yanfeng Wang, Xin Sun, Ya Zhang, Yongguo Yu, Kun Sun, Weidi Xie 2/17/2026

An Agentic System for Rare Disease Diagnosis with Traceable Reasoning

DeepRare multi-agent system using LLMs with traceable reasoning for differential diagnosis of rare diseases through agentic workflow.

Ax Fabio Merizzi, Harilaos Loukos 2/17/2026

Vision Transformers for Multi-Variable Climate Downscaling: Emulating Regional Climate Models with a Shared Encoder and Multi-Decoder Architecture

Vision Transformer architecture with shared encoder and multi-decoder for climate model downscaling as efficient alternative to regional climate models.

Ax Yuta Sato, Kazuhiko Kawamoto, Hiroshi Kera 2/17/2026

Chain of Thought in Order: Discovering Learning-Friendly Orders for Arithmetic

Study on optimal ordering of chain-of-thought reasoning steps in Transformers for mathematical tasks, showing significant impact on reasoning difficulty.

Ax Zhenglun Kong, Mufan Qiu, John Boesen, Xiang Lin, Sukwon Yun, Tianlong Chen, Manolis Kellis, Marinka Zitnik 2/17/2026

SPATIA: Multimodal Generation and Prediction of Spatial Cell Phenotypes

SPATIA multimodal generative model for analyzing spatial transcriptomics data combining cell images, gene expression, and spatial context.

Ax Xiaohang Tang, Rares Dolga, Sangwoong Yoon, Ilija Bogunovic 2/17/2026

wd1: Weighted Policy Optimization for Reasoning in Diffusion Language Models

Weighted policy optimization method for improving reasoning in diffusion-based LLMs through RL without requiring exact likelihood computations.

Ax Xiaojie Li, Zhijie Cai, Nan Qi, Chao Dong, Guangxu Zhu, Haixia Ma, Qihui Wu, Shi Jin 2/17/2026

A Disentangled Representation Learning Framework for Low-altitude Network Coverage Prediction

Machine learning approach for predicting low-altitude network coverage using disentangled representation learning on base station operational parameters.

Ax Kaixian Qu, Guowei Lan, René Zurbrügg, Changan Chen, Christopher E. Mower, Haitham Bou-Ammar, Marco Hutter 2/17/2026

A Pragmatist Robot: Learning to Plan Tasks by Experiencing the Real World

DeepRobot system uses LLMs for robotic task planning with verbal RL feedback loop to align models with real-world robot embodiment and constraints.

Ax Tianfu Wang, Liwei Deng, Xi Chen, Junyang Wang, Huiguo He, Zhengyu Hu, Wei Wu, Leilei Ding, Qilin Fan, Hui Xiong 2/17/2026

Virne: A Comprehensive Benchmark for RL-based Network Resource Allocation in NFV

Virne benchmark framework for evaluating deep RL methods on network resource allocation in Network Function Virtualization infrastructure.

Ax Lakshya A Agrawal, Shangyin Tan, Dilara Soylu, Noah Ziems, Rishi Khare, Krista Opsahl-Ong, Arnav Singhvi, Herumb Shandilya, Michael J Ryan, Meng Jiang, Christopher Potts, Koushik Sen, Alexandros G. Dimakis, Ion Stoica, Dan Klein, Matei Zaharia, Omar Khattab 2/17/2026

GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning

GEPA uses genetic algorithms and Pareto optimization for prompt evolution as alternative to RL fine-tuning of LLMs, achieving better performance with fewer rollouts.

Ax Xuan Liu, Siru Ouyang, Xianrui Zhong, Jiawei Han, Huimin Zhao 2/17/2026

FGBench: A Dataset and Benchmark for Molecular Property Reasoning at Functional Group-Level in Large Language Models

FGBench dataset and benchmark for evaluating LLM reasoning on molecular property prediction at functional group level with structure-aware interpretability.

Ax Arturo Sánchez-Matas, Pablo Escribano Ruiz, Daniel Díaz-López, Angel Luis Perales Gómez, Pantaleone Nespoli, Gregorio Martínez Pérez 2/17/2026

Simulating Cyberattacks through a Breach Attack Simulation (BAS) Platform empowered by Security Chaos Engineering (SCE)

Framework integrating Security Chaos Engineering with Breach Attack Simulation platforms for testing organizational cyber defenses.

Ax Zhaomin Wu, Mingzhe Du, See-Kiong Ng, Bingsheng He 2/17/2026

Beyond Prompt-Induced Lies: Investigating LLM Deception on Benign Prompts

Research investigating unintended deception in LLMs on benign prompts without explicit hidden objectives, revealing trustworthiness risks in reasoning tasks.

Ax Md Sultanul Arifin, Abu Nowshed Sakib, Yeasir Rayhan, Tanzima Hashem 2/17/2026

Lightning Prediction under Uncertainty: DeepLight with Hazy Loss

DeepLight deep learning architecture for lightning prediction with Hazy Loss function to account for prediction uncertainty.

Ax Artzai Picon, Itziar Eguskiza, Daniel Mugica, Javier Romero, Carlos Javier Jimenez, Eric White, Gabriel Do-Lago-Junqueira, Christian Klukas, Ramon Navarra-Mestre 2/17/2026

Robust MultiSpecies Agricultural Segmentation Across Devices, Seasons, and Sensors Using Hierarchical DINOv2 Models

Agricultural segmentation framework using hierarchical DINOv2 models for robust plant species and damage detection across devices, seasons, and sensors.

Ax Rishikesh Devanathan, Varun Nathan, Ayush Kumar 2/17/2026

Why Synthetic Isn't Real Yet: A Diagnostic Framework for Contact Center Dialogue Generation

Diagnostic framework for evaluating synthetic dialogue generation for contact centers using structured supervision on call attributes.

Ax Konur Tholl, Mariam El Mezouar, Adrian Taylor, Ranwa Al Mallah 2/17/2026

Towards Production-Worthy Simulation for Autonomous Cyber Operations

Framework extending CybORG environment for training RL agents in autonomous cyber operations with production-worthy simulation accuracy.

Ax Hui Chen, Antoine Didisheim, Mohammad (Mo) Pourmohammadi, Luciano Somoza, Hanqing Tian 2/17/2026

A Financial Brain Scan of the LLM

Interpretability study that "brain-scans" LLMs to identify the economic concepts guiding their financial forecasts and to map the relative importance of those concepts without reducing performance.

Ax Arjun Basandrai, Shourya Jain, K. Ilanthenral 2/17/2026

ART: Adaptive Resampling-based Training for Imbalanced Classification

Adaptive Resampling-based Training method dynamically adjusts training data distribution based on per-class learning difficulty for imbalanced classification.

Ax Konur Tholl, François Rivest, Mariam El Mezouar, Adrian Taylor, Ranwa Al Mallah 2/17/2026

Large Language Model Integration with Reinforcement Learning to Augment Decision-Making in Autonomous Cyber Operations

Integration of LLMs with RL for autonomous cyber operations, using LLM pre-trained knowledge to augment agent decision-making and reduce exploration cost.

Ax Rafael Zimmer, Oswaldo Luiz do Valle Costa 2/17/2026

Reinforcement Learning-Based Market Making as a Stochastic Control on Non-Stationary Limit Order Book Dynamics

RL-based market making strategy modeling limit order book dynamics as stochastic control problem for algorithmic trading.

Ax Hugo Carlesso, Josiane Mothe, Radu Tudor Ionescu 2/17/2026

Curriculum Multi-Task Self-Supervision Improves Lightweight Architectures for Onboard Satellite Hyperspectral Image Segmentation

Lightweight satellite hyperspectral image segmentation using curriculum multi-task self-supervision for onboard processing.

Ax Phillipe R. Sampaio 2/17/2026

Complexity Bounds for Smooth Multiobjective Optimization

Theoretical analysis of oracle complexity for finding Pareto stationary points in smooth multiobjective optimization problems.

Ax Kaiwen Zheng, Huayu Chen, Haotian Ye, Haoxiang Wang, Qinsheng Zhang, Kai Jiang, Hang Su, Stefano Ermon, Jun Zhu, Ming-Yu Liu 2/17/2026

DiffusionNFT: Online Diffusion Reinforcement with Forward Process

DiffusionNFT introduces online reinforcement learning for diffusion models using forward process, addressing limitations in post-training diffusion model optimization.

Ax Xuyang Ge, Wentao Shu, Jiaxing Wu, Yunhua Zhou, Zhengfu He, Xipeng Qiu 2/17/2026

Evolution of Concepts in Language Model Pre-Training

Interpretability research tracking feature evolution during language model pre-training using sparse dictionary learning (crosscoders) to understand capability emergence.

Ax Bingsheng Yao, Jiaju Chen, Chaoran Chen, April Wang, Toby Jia-jun Li, Dakuo Wang 2/17/2026

Through the Lens of Human-Human Collaboration: A Configurable Research Platform for Exploring Human-Agent Collaboration

Research platform for exploring human-agent collaboration using LLM agents, applying principles from human-human collaboration to human-LLM partnerships.

Ax Chen Liang, Zhaoqi Huang, Haofen Wang, Fu Chai, Chunying Yu, Huanhuan Wei, Zhengjie Liu, Yanpeng Li, Hongjun Wang, Ruifeng Luo, Xianzhong Zhao 2/17/2026

AECBench: A Hierarchical Benchmark for Knowledge Evaluation of Large Language Models in the AEC Field

AECBench benchmark evaluates LLM robustness and reliability in Architecture, Engineering, and Construction domain with hierarchical knowledge evaluation.

Ax Anton Korznikov, Andrey Galichin, Alexey Dontsov, Oleg Y. Rogov, Ivan Oseledets, Elena Tutubalina 2/17/2026

The Rogue Scalpel: Activation Steering Compromises LLM Safety

Research showing that activation steering, a technique for controlling LLM behavior, systematically breaks model alignment safeguards and makes models comply with harmful requests.

Ax Haodong Liang, Yanhao Jin, Krishnakumar Balasubramanian, Lifeng Lai 2/17/2026

Differentially Private Two-Stage Gradient Descent for Instrumental Variable Regression

ArXiv paper proposing differentially private two-stage gradient descent algorithm for instrumental variable regression with privacy-utility tradeoffs.

Ax Zeyu Shen, Basileal Imana, Tong Wu, Chong Xiang, Prateek Mittal, Aleksandra Korolova 2/17/2026

ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search

ArXiv paper proposing ReliabilityRAG, provably robust defense against prompt injection and retrieval corpus attacks on RAG-based web search systems.

Ax Zhaomin Wu, Haodong Zhao, Ziyang Wang, Jizhou Guo, Qian Wang, Bingsheng He 2/17/2026

LLM DNA: Tracing Model Evolution via Functional Representations

ArXiv paper introducing LLM DNA method for tracing evolutionary relationships between models via functional representations without task-specific constraints.

Ax Chi Zhang, Zehua Chen, Kaiwen Zheng, Jun Zhu 2/17/2026

VoiceBridge: General Speech Restoration with One-step Latent Bridge Models

ArXiv paper proposing VoiceBridge, one-step latent bridge model for general speech restoration from diverse distortions at 48 kHz fullband quality.

Ax Xin Xu, Xunzhi He, Churan Zhi, Ruizhe Chen, Julian McAuley, Zexue He 2/17/2026

BiasFreeBench: a Benchmark for Mitigating Bias in Large Language Model Responses

ArXiv paper introducing BiasFreeBench, standardized benchmark for evaluating and comparing bias mitigation methods in LLM responses with consistent metrics.

Ax Xinjie Shen, Mufei Li, Pan Li 2/17/2026

Measuring Physical-World Privacy Awareness of Large Language Models: An Evaluation Benchmark

ArXiv paper introducing EAPrivacy benchmark for measuring physical-world privacy awareness of LLM-powered embodied agents in procedurally generated scenarios.

Ax Yukun Zhang, Xueqing Zhou 2/17/2026

Where to Add PDE Diffusion in Transformers

ArXiv paper studying optimal placement of PDE diffusion layers in hybrid transformer architectures to add local geometric priors along sequence axis.

Ax Sahil Joshi, Agniva Chowdhury, Amar Kanakamedala, Ekam Singh, Evan Tu, Anshumali Shrivastava 2/17/2026

RACE Attention: A Strictly Linear-Time Attention for Long-Sequence Training

ArXiv paper proposing RACE Attention, a strictly linear-time attention mechanism enabling long-sequence training beyond quadratic softmax attention limitations.

Ax Buyun Liang, Liangzu Peng, Jinqi Luo, Darshan Thaker, Kwan Ho Ryan Chan, René Vidal 2/17/2026

SECA: Semantically Equivalent and Coherent Attacks for Eliciting LLM Hallucinations

ArXiv paper introducing SECA, method for eliciting LLM hallucinations using semantically equivalent and coherent adversarial attacks to test reliability.

Ax Akira Kitaoka 2/17/2026

Inverse Mixed-Integer Programming: Learning Constraints then Objective Functions

ArXiv paper addressing data-driven inverse optimization for mixed-integer linear programs by learning both constraints and objective functions from observed decisions.

Ax Nizar El Ghazal, Antoine Caubrière, Valentin Vielzeuf 2/17/2026

The Speech-LLM Takes It All: A Truly Fully End-to-End Spoken Dialogue State Tracking Approach

ArXiv paper comparing context management strategies for end-to-end spoken dialogue state tracking using Speech-LLMs on SpokenWOZ corpus.

Ax Aadithya Srikanth, Mudit Gaur, Vaneet Aggarwal 2/17/2026

Discrete State Diffusion Models: A Sample Complexity Perspective

ArXiv paper providing theoretical analysis of sample complexity in discrete-state diffusion models for text, sequences, and combinatorial structures.

Ax Marcel Meyer, Sascha Kaltenpoth, Kevin Zalipski, Oliver Müller 2/17/2026

Challenges and Requirements for Benchmarking Time Series Foundation Models

ArXiv paper examining benchmarking challenges for Time Series Foundation Models, addressing test set integrity issues as training corpora grow large.

Ax Samuel Lippl, Thomas McGee, Kimberly Lopez, Ziwen Pan, Pierce Zhang, Salma Ziadi, Oliver Eberle, Ida Momennejad 2/17/2026

Algorithmic Primitives and Compositional Geometry of Reasoning in Language Models

ArXiv paper introducing framework for tracing and steering algorithmic primitives underlying LLM multi-step reasoning by linking reasoning traces to internal activations.

Ax Dechen Zhang, Junwei Su, Difan Zou 2/17/2026

Learning under Quantization for High-Dimensional Linear Regression

ArXiv paper providing first theoretical analysis of low-bit quantization effects on learning performance in high-dimensional linear regression settings.

Ax Jusheng Zhang, Kaitong Cai, Jing Yang, Jian Wang, Chengpei Tang, Keze Wang 2/17/2026

Top-Down Semantic Refinement for Image Captioning

ArXiv paper proposing top-down semantic refinement technique for improving image captioning quality in Vision-Language Models through multi-step generation.

Ax Ranran Haoran Zhang, Soumik Dey, Ashirbad Mishra, Hansi Wu, Binbin Li, Rui Zhang 2/17/2026

Batch Speculative Decoding Done Right

ArXiv paper identifying critical correctness violations in existing batch speculative decoding implementations and proposing fixes to ensure output equivalence with standard autoregressive generation.