Isolater - Feed

Ax Jonathan Lys, Vincent Gripon, Bastien Pasdeloup, Lukas Mauch, Fabien Cardinaux, Ghouthi Boukli Hacene 2/17/2026

Residual Connections and the Causal Shift: Uncovering a Structural Misalignment in Transformers

Analysis of structural misalignment in Transformers between residual connections and causal masking in next-token prediction.

Ax Stefano Woerner, Seong Joon Oh, Christian F. Baumgartner 2/17/2026

Universal Algorithm-Implicit Learning

Theoretical framework for meta-learning defining practical universality and distinguishing algorithm-implicit learning approaches.

Ax Sara Rajaee, Sebastian Vincent, Alexandre Berard, Marzieh Fadaee, Kelly Marchisio, Tom Kocmi 2/17/2026

Unlocking Reasoning Capability on Machine Translation in Large Language Models

Evaluation of reasoning-oriented LLMs on machine translation showing explicit reasoning degrades translation quality.

Ax Shiwei Hong, Lingyao Li, Ethan Z. Rong, Chenxinran Shen, Zhicong Lu 2/17/2026

Multi-Agent Comedy Club: Investigating Community Discussion Effects on LLM Humor Generation

Multi-agent LLM system for comedy writing with community discussion feedback stored as social memory affecting output quality.

Ax Shih-Fang Chen, Jun-Cheng Chen, I-Hong Jhuo, Yen-Yu Lin 2/17/2026

GOT-JEPA: Generic Object Tracking with Model Adaptation and Occlusion Handling using Joint-Embedding Predictive Architecture

Object tracking model using joint-embedding predictive architecture with occlusion handling and model adaptation.

Ax Emanuele Ricco, Elia Onofri, Lorenzo Cima, Stefano Cresci, Roberto Di Pietro 2/17/2026

A Geometric Analysis of Small-sized Language Model Hallucinations

Geometric analysis of hallucinations in small-sized LLMs through embedding space clustering in multi-step and agentic settings.

Ax Benoît Dupont, Chad Whelan, Serge-Olivier Paquette 2/17/2026

What hackers talk about when they talk about AI: Early-stage diffusion of a cybercrime innovation

Analysis of cybercriminal discussions about AI adoption from cyber threat intelligence forum data.

Ax Yubin Cho, Hyunwoo Yu, Kyeongbo Kong, Kyomin Sohn, Bongjoon Hyun, Suk-Ju Kang 2/17/2026

VIPA: Visual Informative Part Attention for Referring Image Segmentation

Framework for referring image segmentation using visual attention mechanisms to exploit context for fine-grained object segmentation.

Ax Pengcheng Pan, Yonekura Shogo, Yasuo Kuniyoshi 2/17/2026

Debiasing Central Fixation Confounds Reveals a Peripheral "Sweet Spot" for Human-like Scanpaths in Hard-Attention Vision

Vision model study on scanpath metrics and center bias in hard-attention models using gaze tracking datasets.

Ax Bardia Mohammadi, Nearchos Potamitis, Lars Klein, Akhil Arora, Laurent Bindschaedler 2/17/2026

Atomix: Timely, Transactional Tool Use for Reliable Agentic Workflows

Atomix: Runtime providing transactional semantics for LLM agent tool calls with epoch tagging and safe rollback mechanisms for reliable agentic workflows.

Ax Pierre-Alexandre Mattei, Bruno Loureiro 2/17/2026

The Well-Tempered Classifier: Some Elementary Properties of Temperature Scaling

Theoretical analysis of temperature scaling properties for controlling uncertainty in probabilistic models and LLM stochasticity.

Ax Ilia Mahrooghi, Aryo Lotfi, Emmanuel Abbe 2/17/2026

Goldilocks RL: Tuning Task Difficulty to Escape Sparse Rewards for Reasoning

Goldilocks RL uses adaptive curriculum learning to optimize task difficulty and improve sample efficiency in reasoning model training.

Ax Yu Huang, Zixin Wen, Yuejie Chi, Yuting Wei, Aarti Singh, Yingbin Liang, Yuxin Chen 2/17/2026

On the Learning Dynamics of RLVR at the Edge of Competence

Theoretical analysis of RLVR training dynamics explaining how outcome-based rewards enable long-horizon reasoning in transformers.

Ax Qingqing Zhu, Qiao Jin, Tejas S. Mathai, Yin Fang, Zhizheng Wang, Yifan Yang, Maame Sarfo-Gyamfi, Benjamin Hou, Ran Gu, Praveen T. S. Balamuralikrishna, Kenneth C. Wang, Ronald M. Summers, Zhiyong Lu 2/17/2026

CT-Bench: A Benchmark for Multimodal Lesion Understanding in Computed Tomography

CT-Bench multimodal dataset with 20,335 lesions from CT studies for training AI models on lesion understanding and report generation.

Ax Eloi Martinet, Ilias Ftouhi 2/17/2026

Numerical exploration of the range of shape functionals using neural networks

Neural network framework for exploring shape functionals and Blaschke-Santaló diagrams in convex geometry optimization.

Ax Pramit Saha, Joshua Strong, Mohammad Alsharid, Divyanshu Mishra, J. Alison Noble 2/17/2026

Picking the Right Specialist: Attentive Neural Process-based Selection of Task-Specialized Models as Tools for Agentic Healthcare Systems

Neural process-based method for selecting specialized models as tools in agentic healthcare systems for multi-task clinical queries.

Ax Fiorenzo Parascandolo, Wenhui Tan, Enver Sangineto, Ruihua Song, Rita Cucchiara 2/17/2026

BFS-PO: Best-First Search for Large Reasoning Models

BFS-PO RL algorithm optimizes inference efficiency in large reasoning models by reducing overthinking and computational costs.

Ax Tianyi Ma, Yiyue Qian, Zehong Wang, Zheyuan Zhang, Chuxu Zhang, Yanfang Ye 2/17/2026

BHyGNN+: Unsupervised Representation Learning for Heterophilic Hypergraphs

BHyGNN+: unsupervised representation learning approach for heterophilic hypergraphs.

Ax Zun Wang, Han Lin, Jaehong Yoon, Jaemin Cho, Yue Zhang, Mohit Bansal 2/17/2026

AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories

AnchorWeave method for maintaining spatial consistency in long-horizon camera-controllable video generation using local spatial memories.

Ax Yian Wang, Han Yang, Minghao Guo, Xiaowen Qiu, Tsun-Hsuan Wang, Wojciech Matusik, Joshua B. Tenenbaum, Chuang Gan 2/17/2026

PhyScensis: Physics-Augmented LLM Agents for Complex Physical Scene Arrangement

PhyScensis uses LLM agents with physics reasoning to generate realistic 3D scene arrangements for robotic simulation data collection.

Ax Ayush Shrivastava, Kirtan Gangani, Laksh Jain, Mayank Goel, Nipun Batra 2/17/2026

ThermEval: A Structured Benchmark for Evaluation of Vision-Language Models on Thermal Imagery

ThermEval benchmark for evaluating vision-language models on thermal imagery for applications like surveillance and autonomous driving.

Ax Tim Mangliers, Bernhard Mössner, Benjamin Himpel 2/17/2026

Spectral Convolution on Orbifolds for Geometric Deep Learning

Spectral convolution techniques for geometric deep learning on non-Euclidean data structures like graphs and manifolds.

Ax Avinandan Bose, Shuyue Stella Li, Faeze Brahman, Pang Wei Koh, Simon Shaolei Du, Yulia Tsvetkov, Maryam Fazel, Lin Xiao, Asli Celikyilmaz 2/17/2026

Cold-Start Personalization via Training-Free Priors from Structured World Models

Cold-start personalization method using structured world models and RL to infer user preferences with limited interaction budget.

Ax Cai Zhou, Zijie Chen, Zian Li, Jike Wang, Kaiyi Jiang, Pan Li, Rose Yu, Muhan Zhang, Stephen Bates, Tommi Jaakkola 2/17/2026

Rethinking Diffusion Models with Symmetries through Canonicalization with Applications to Molecular Graph Generation

Research on diffusion models using canonicalization to handle symmetries in molecular graph generation tasks.

Ax Shangding Gu 2/17/2026

Long Context, Less Focus: A Scaling Gap in LLMs Revealed through Privacy and Personalization

PAPerBench benchmark studying how context length in LLMs affects privacy leakage and personalization quality in a large-scale evaluation.

Ax Søren Riis 2/17/2026

Mastering NIM and Impartial Games with Weak Neural Networks: An AlphaZero-inspired Multi-Frame Approach

Study on game-playing weak neural networks under fixed-scale quantization, proving representational barriers for impartial game mastery.

Ax Corina Catarau-Cotutiu, Esther Mondragon, Eduardo Alonso 2/17/2026

A representational framework for learning and encoding structurally enriched trajectories in complex agent environments

Framework for learning enriched trajectory representations enabling AI agents to make better decisions across different domains and tasks.

Ax Jiapeng Wang, Jinhao Jiang, Zhiqiang Zhang, Jun Zhou, Wayne Xin Zhao 2/17/2026

RV-Syn: Rational and Verifiable Mathematical Reasoning Data Synthesis based on Structured Function Library

RV-Syn: data synthesis method for generating high-quality mathematical reasoning data using structured function libraries for LLM training.

Ax Shuai Yang, Qi Yang, Luoxi Tang, Yuqiao Meng, Nancy Guo, Jeremy Blackburn, Zhaohan Xi 2/17/2026

On the Eligibility of LLMs for Counterfactual Reasoning: A Decompositional Study

Decompositional study of the factors that impede LLM performance and generalization on counterfactual reasoning tasks.

Ax Matthew Kowal, Jasper Timm, Jean-Francois Godbout, Thomas Costello, Antonio A. Arechar, Gordon Pennycook, David Rand, Adam Gleave, Kellin Pelrine 2/17/2026

It's the Thought that Counts: Evaluating the Attempts of Frontier LLMs to Persuade on Harmful Topics

Benchmark evaluating persuasion capabilities of frontier LLMs on harmful topics, assessing model propensity for harmful persuasion attempts.

Ax Zeju Li, Jianyuan Zhong, Ziyang Zheng, Xiangyu Wen, Zhijian Xu, Yingying Cheng, Fan Zhang, Qiang Xu 2/17/2026

Making Slow Thinking Faster: Compressing LLM Chain-of-Thought via Step Entropy

CoT compression framework using step entropy metrics to reduce redundancy in LLM chain-of-thought reasoning and inference costs.

Ax Sviatoslav Lushnei, Dmytro Shumskyi, Severyn Shykula, Ernesto Jimenez-Ruiz, Artur d'Avila Garcez 2/17/2026

Large Language Models as Oracles for Ontology Alignment

Using LLMs as oracles for ontology alignment with human-in-the-loop approaches to improve mapping quality for large ontologies.

Ax Muhammed Ustaomeroglu, Baris Askin, Gauri Joshi, Carlee Joe-Wong, Guannan Qu 2/17/2026

Internal Planning in Language Models: Characterizing Horizon and Branch Awareness

Analysis of planning capabilities in decoder-only language models, examining horizon and branch awareness in transformer architectures.

Ax Divij Handa, Mihir Parmar, Aswin RRV, Md Nayem Uddin, Hamid Palangi, Chitta Baral 2/17/2026

GuidedSampling: Steering LLMs Towards Diverse Candidate Solutions at Inference-Time

GuidedSampling: inference-time algorithm steering LLMs to generate diverse candidate solutions, improving performance on complex tasks.

Ax Qingni Wang, Yue Fan, Xin Eric Wang 2/17/2026

SAFER: Risk-Constrained Sample-then-Filter in Large Language Models

SAFER method for risk-constrained sampling in LLMs to ensure trustworthy outputs in risk-sensitive applications like question answering.

Ax Caorui Li, Yu Chen, Yiyan Ji, Jin Xu, Zhenyu Cui, Shihao Li, Yuanxing Zhang, Wentao Wang, Zhenghao Song, Dingling Zhang, Ying He, Haoxiang Liu, Yuxuan Wang, Qiufeng Wang, Jiafu Tang, Zhenhe Wu, Jiehui Luo, Zhiyu Pan, Weihao Xie, Chenchen Zhang, Zhaohui Wang, Jiayi Tian, Yanghai Wang, Zhe Cao, Minxin Dai, Ke Wang, Runzhe Wen, Yinghao Ma, Yaning Pan, Sungkyun Chang, Termeh Taheri, Haiwen Xia, Christos Plachouras, Emmanouil Benetos, Yizhi Li, Ge Zhang, Jian Yang, Tianhao Peng, Zili Wang, Minghao Liu, Junran Peng, Zhaoxiang Zhang, Jiaheng Liu 2/17/2026

OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs

OmniVideoBench: evaluation benchmark for multimodal LLMs on audio-visual understanding tasks with comprehensive synergistic reasoning assessment.

Ax Shiqi Zhang, Xinbei Ma, Yunqing Xu, Zouying Cao, Pengrui Lu, Haobo Yuan, Tiancheng Shen, Zhuosheng Zhang, Hai Zhao, Ming-Hsuan Yang 2/17/2026

ParaCook: On Time-Efficient Planning for Multi-Agent Systems

ParaCook benchmark for evaluating time-efficient collaborative planning in multi-agent systems using LLMs for long-horizon reasoning.

Ax Ni Zhang, Zhiguang Cao, Jianan Zhou, Cong Zhang, Yew-Soon Ong 2/17/2026

An Agentic Framework with LLMs for Solving Complex Vehicle Routing Problems

Agentic framework using LLMs to solve complex vehicle routing problems with autonomous decision-making and improved solution feasibility.

Ax Minwei Kong, Ao Qu, Xiaotong Guo, Wenbin Ouyang, Chonghe Jiang, Han Zheng, Yining Ma, Dingyi Zhuang, Yuhan Tang, Junyi Li, Shenhao Wang, Haris Koutsopoulos, Hai Wang, Cathy Wu, Jinhua Zhao 2/17/2026

AlphaOPT: Formulating Optimization Programs with Self-Improving LLM Experience Library

AlphaOPT uses LLMs with self-improving experience libraries to automate optimization problem formulation from natural language into mathematical models and solver code.

Ax Gyuyeon Na, Minjung Park, Hyeonjeong Cha, Sangmi Chai 2/17/2026

Human-Centered LLM-Agent System for Detecting Anomalous Digital Asset Transactions

HCLA: human-centered multi-agent system for anomaly detection in digital asset transactions using conversational workflow.

Ax Xinyuan Wang, Hongyu Cao, Kunpeng Liu, Yanjie Fu 2/17/2026

Dataforge: Agentic Platform for Autonomous Data Engineering

Dataforge: LLM-powered agentic platform for autonomous data engineering including cleaning, normalization, and feature engineering.

Ax Qile Jiang, George Karniadakis 2/17/2026

AgenticSciML: Collaborative Multi-Agent Systems for Emergent Discovery in Scientific Machine Learning

AgenticSciML: multi-agent system with 10+ specialized agents for automated design of scientific machine learning architectures.

Ax Sejin Kim, Hayan Choi, Seokki Lee, Sundong Kim 2/17/2026

ARCTraj: A Dataset and Benchmark of Human Reasoning Trajectories for Abstract Problem Solving

ARCTraj: dataset of human reasoning trajectories on abstract visual reasoning tasks with temporal action sequences.

Ax Robab Aghazadeh Chakherlou, Siddartha Khastgir, Xingyu Zhao, Jerein Jeyachandran, Shufeng Chen 2/17/2026

Uncertainty-Aware Measurement of Scenario Suite Representativeness for Autonomous Systems

Method for measuring representativeness of scenario datasets for autonomous vehicle testing and safety assurance.

Ax Yizhi Wang, Linan Yue, Min-Ling Zhang 2/17/2026

Training Multimodal Large Reasoning Models Needs Better Thoughts: A Three-Stage Framework for Long Chain-of-Thought Synthesis and Selection

Three-stage framework for synthesizing and selecting long chain-of-thought training data for multimodal large reasoning models.

Ax Ariana Azarbal, Victor Gillioz, Vladimir Ivanov, Bryce Woodworth, Jacob Drori, Nevan Wichers, Aram Ebtekar, Alex Cloud, Alexander Matt Turner 2/17/2026