Isolater - Feed

Ax Hengrui Gu, Xiaotian Han, Yujing Bian, Kaixiong Zhou 27d ago

Rethinking Exploration in RLVR: From Entropy Regularization to Refinement via Bidirectional Entropy Modulation

Improves exploration in reinforcement learning with verifiable rewards for LLMs using bidirectional entropy modulation instead of standard regularization.

Ax LM-Provers, Yuxiao Qu, Amrith Setlur, Jasper Dekoninck, Edward Beeching, Jia Li, Ian Wu, Lewis Tunstall, Aviral Kumar 27d ago

QED-Nano: Teaching a Tiny Model to Prove Hard Theorems

QED-Nano trains small neural networks to prove mathematical theorems, enabling reproducible and efficient theorem-proving without large models.

Ax Mohammad Zangooei, Jannis Weil, Amr Rizk, Mina Tahmasbi Arashloo, Raouf Boutaba 27d ago

Analyzing Symbolic Properties for DRL Agents in Systems and Networking

Verification and analysis of symbolic properties in deep reinforcement learning agents for systems and networking tasks.

Ax Zhen Zhang, Shanqing Liu, Alessandro Alla, Jerome Darbon, George Em Karniadakis 27d ago

PINNs in PDE Constrained Optimal Control Problems: Direct vs Indirect Methods

Physics-informed neural networks for optimal control of PDEs using direct and indirect formulations.

Ax Parsa Hosseini, Sumit Nawathe, Mahdi Salmani, Meisam Razaviyayn, Soheil Feizi 27d ago

Early Stopping for Large Reasoning Models via Confidence Dynamics

Method for early stopping in large language model reasoning by analyzing confidence dynamics to reduce computational cost without degrading performance.

Ax Farzane Aminmansour, Taher Jafferjee, Ehsan Imani, Erin Talvitie, Michael Bowling, Martha White 27d ago

Mitigating Value Hallucination in Dyna Planning via Multistep Predecessor Models

Addresses value hallucination in Dyna-style reinforcement learning agents by using multistep predecessor models to improve model-based RL.

Ax Daniele Zambon, Andrea Cini, Cesare Alippi 27d ago

Graph State-Space Models and Latent Relational Inference

State-space models with relational inductive biases for multivariate time series prediction using graph structures.

Ax Yikun Ban, Yuchen Yan, Arindam Banerjee, Jingrui He 27d ago

Neural Exploitation and Exploration of Contextual Bandits

Neural networks applied to contextual multi-armed bandits, comparing epsilon-greedy, Thompson Sampling, and UCB techniques for exploration-exploitation trade-offs.

Ax Kayhan Behdin, Wenyu Chen, Rahul Mazumder 27d ago

Sparse Gaussian Graphical Models with Discrete Optimization: Computational and Statistical Perspectives

GraphL0BnB learns sparse precision matrices in Gaussian graphical models using discrete optimization with ℓ0 penalties.

Ax Mengchu Li, Ye Tian, Yang Feng, Yi Yu 27d ago

Federated Transfer Learning with Differential Privacy

Federated transfer learning framework addressing data heterogeneity and privacy across distributed sites using differential privacy.

Ax M. Rostami, S. S. Kia 27d ago

FedScalar: Federated Learning with Scalar Communication for Bandwidth-Constrained Networks

FedScalar reduces federated learning communication overhead by encoding high-dimensional updates as two scalar values per agent per round.

Ax Gavin Kerrigan, Kai Nelson, Padhraic Smyth 27d ago

EventFlow: Forecasting Temporal Point Processes with Flow Matching

EventFlow uses flow matching to forecast temporal point processes with irregular event intervals, improving on autoregressive neural approaches.

Ax Ricardo Gama, Ricardo Cunha, Daniel Fuertes, Carlos R. del-Blanco, Hugo L. Fernandes 27d ago

Multi-Agent Environments for Vehicle Routing Problems

Open-source RL framework for vehicle routing problems, extending reinforcement learning to discrete optimization in operations research.

Ax Zhouxing Shi, Haoyu Li, Cho-Jui Hsieh, Huan Zhang 27d ago

Certified Training with Branch-and-Bound for Lyapunov-stable Neural Control

Framework for training verifiably Lyapunov-stable neural controllers using branch-and-bound certified training within a region of attraction.

Ax Cen-You Li, Marc Toussaint, Barbara Rakitsch, Christoph Zimmer 27d ago

Amortized Safe Active Learning for Real-Time Data Acquisition: Pretrained Neural Policies From Simulated Nonparametric Functions

Safe active learning method using amortized neural policies for real-time data acquisition with safety constraints, replacing repeated GP updates.

Ax Yijia Zhao, Qing Zhou 27d ago

Causal Bandit Over Unknown Graphs: Upper Confidence Bounds With Backdoor Adjustment

Causal bandit algorithms for unknown DAGs, using upper confidence bounds and backdoor adjustment for intervention discovery.

Ax Jiamin Xu, Ivan Nazarov, Aditya Rastogi, África Periáñez, Kyra Gan 27d ago

From Restless to Contextual: A Thresholding Bandit Reformulation For Finite-horizon Improvement

Reformulates finite-horizon restless bandit problems as thresholding bandits, with improved sample complexity and policy convergence.

Ax Cheng Fang, Rishabh Dixit, Waheed U. Bajwa, Mert Gurbuzbalaban 27d ago

RESIST: Resilient Decentralized Learning Using Consensus Gradient Descent

Decentralized learning using consensus gradient descent under privacy and communication constraints across networked devices.

Ax Ganghua Wang, Yuhong Yang, Jie Ding 27d ago

Model Privacy: A Unified Framework for Understanding Model Stealing Attacks and Defenses

Unified framework for understanding model stealing attacks and defenses, analyzing how ML services are vulnerable to adversarial extraction through query access.

Ax Binxu Wang, Cengiz Pehlevan 27d ago

An Analytical Theory of Spectral Bias in the Learning Dynamics of Diffusion Models

Analytical framework explaining spectral bias in diffusion model training dynamics using Gaussian equivalence and probability-flow ODEs.

Ax Tongrui Su, Qingbin Li, Shengyu Zhu, Wei Chen, Xueqi Cheng 27d ago

RaPA: Enhancing Transferable Targeted Attacks via Random Parameter Pruning

RaPA improves transferable targeted adversarial attacks by random parameter pruning to reduce reliance on surrogate model subsets.

Ax Zaiwei Chen, Phalguni Nanda 27d ago

From Set Convergence to Pointwise Convergence: Finite-Time Guarantees for Average-Reward Q-Learning with Adaptive Stepsizes

Finite-time convergence analysis for average-reward Q-learning with adaptive stepsizes, showing O(1/k) convergence rate.

Ax Dip Roy, Rajiv Misra, Sanjay Kumar Singh, Anisha Roy 27d ago

A Multi-Level Causal Intervention Framework for Mechanistic Interpretability in Variational Autoencoders

First mechanistic interpretability framework for VAEs using multi-level causal interventions to understand generative model representations.

Ax Yue Deng, Asadullah Hill Galib, Xin Lan, Jack Gunn, Pang-Ning Tan, Lifeng Luo 27d ago

FABLE: A Localized, Targeted Adversarial Attack on Weather Forecasting Models

FABLE investigates adversarial attacks on deep learning weather forecasting models and proposes a localized, targeted attack method.

Ax Yutian He, Yankun Huang, Yao Yao, Qihang Lin 27d ago

Enforcing Fair Predicted Scores on Intervals of Percentiles by Difference-of-Convex Constraints

Proposes fairness constraints using difference-of-convex programming for partial fairness in ML predictions across percentile intervals.

Ax Andrew Nam, Declan Campbell, Thomas Griffiths, Jonathan Cohen, Sarah-Jane Leslie 27d ago

Understanding Task Representations in Neural Networks via Bayesian Ablation

Introduces Bayesian ablation framework for interpreting learned task representations in neural networks through probabilistic inference.

Ax Shibo Feng, Zhicheng Chen, Xi Xiao, Zhong Zhang, Qing Li, Xingyu Gao, Peilin Zhao 27d ago

MSDformer: Multi-scale Discrete Transformer For Time Series Generation

MSDformer extends discrete token modeling for time series generation using multi-scale transformer architecture to capture temporal patterns.

Ax Fengqing Jiang, Fengbo Ma, Zhangchen Xu, Yuetai Li, Zixin Rao, Bhaskar Ramasubramanian, Luyao Niu, Bo Li, Xianyan Chen, Zhen Xiang, Radha Poovendran 27d ago

SoSBench: Benchmarking Safety Alignment on Six Scientific Domains

SoSBench benchmarks safety alignment of LLMs across six scientific domains with sophisticated risks beyond basic misuse scenarios.

Ax Patrick Vossler, Fan Xia, Yifan Mai, Adarsh Subbaswamy, Jean Feng 27d ago

LLMs Judging LLMs: A Simplex Perspective

Studies the problem of using LLMs as judges for evaluating LLM outputs, addressing epistemic uncertainty in judge quality beyond sampling variability.

Ax Narmeen Oozeer, Luke Marks, Shreyans Jain, Fazl Barez, Amirali Abdullah 27d ago

Beyond Linear Steering: Unified Multi-Attribute Control for Language Models

K-Steering enables unified multi-attribute control of LLMs at inference time using non-linear classifiers on hidden activations to handle attribute interference.

Ax Wei Shen, Zhang Yaxiang, Minhui Huang, Mengfan Xu, Jiawei Zhang, Cong Shen 27d ago

MLorc: Momentum Low-rank Compression for Memory Efficient Large Language Model Adaptation

MLorc proposes momentum low-rank compression for memory-efficient LLM fine-tuning, reducing memory demands compared to LoRA while maintaining performance.

Ax Haoye Lu, Darren Lo, Yaoliang Yu 27d ago

SFBD Flow: A Continuous-Optimization Framework for Training Diffusion Models with Noisy Samples

SFBD Flow framework trains diffusion models on corrupted/noisy data with clean samples to reduce privacy risks and improve convergence in generative modeling.

Ax Hanbing Liu, Lang Cao, Yuanyi Ren, Mengyu Zhou, Haoyu Dong, Xiaojun Ma, Shi Han, Dongmei Zhang 27d ago

Not All Tokens Matter: Towards Efficient LLM Reasoning via Token Significance in Reinforcement Learning

Uses token significance in RL to make LLM reasoning more efficient, identifying and prioritizing important tokens instead of optimizing for length.

Ax Biying Zhou, Nanyu Luo, Feng Ji 27d ago

Federated Item Response Models: A Gradient-driven Privacy-preserving Framework for Distributed Psychometric Estimation

Federated Item Response Theory (FedIRT) framework enabling distributed psychometric estimation without centralizing raw response data.

Ax El Mehdi Achour, Kathlén Kohn, Holger Rauhut 27d ago

The Riemannian Geometry Associated to Gradient Flows of Linear Convolutional Networks

Riemannian geometry analysis of gradient flows for learning deep linear convolutional networks under balancedness conditions.

Ax Fouad Oubari, Mohamed El-Baha, Raphael Meunier, Rodrigue Décatoire, Mathilde Mougeot 27d ago

Multi-Component VAE with Gaussian Markov Random Field

Multi-component VAE using Gaussian Markov Random Fields for generative modeling of complex datasets with intricate dependencies.

Ax Turan Orujlu, Christian Gumbsch, Martin V. Butz, Charley M Wu 27d ago

Causal Process Models: Reframing Dynamic Causal Graph Discovery as a Reinforcement Learning Problem

Causal Process Models for learning sparse time-varying causal graphs from visual observations using reinforcement learning.

Ax Katherine Avery, Chinmay Pendse, David Jensen 27d ago

Evaluating and Learning Robust Bandit Policies Under Uncertain Causal Mechanisms

Causal multi-armed bandit algorithm that reasons under uncertain causal mechanisms specified by graphical models.

Ax Federico Zucchi, Thomas Lampert 27d ago

PRISM: Lightweight Multivariate Time-Series Classification through Symmetric Multi-Resolution Convolutional Layers

PRISM is a lightweight convolutional classifier for multivariate time series that captures multi-scale temporal dependencies with a low parameter count.

Ax Ze Tao, Hanxuan Wang, Fujun Liu 27d ago

LNN-PINN: A Unified Physics-Only Training Framework with Liquid Residual Blocks

LNN-PINN is a physics-informed neural network with liquid residual blocks, aimed at improving predictive accuracy on complex problems.

Ax Daniel Beaglehole, David Holzmüller, Adityanarayanan Radhakrishnan, Mikhail Belkin 27d ago

xRFM: Accurate, scalable, and interpretable feature learning models for tabular data

xRFM provides feature learning models for tabular data as accurate, scalable, and interpretable alternatives to gradient boosted trees.

Ax Pracheta Amaranath, Vinitra Muralikrishnan, Amit Sharma, David Jensen 27d ago

Improving Generative Methods for Causal Evaluation via Simulation-Based Inference

Simulation-based inference methods for generating synthetic datasets to evaluate causal estimators with varying treatment effects.

Ax Alvaro Almeida Gomez 27d ago

A Data-Driven Interpolation Method on Smooth Manifolds via Diffusion Processes and Voronoi Tessellations

Data-driven interpolation method on smooth manifolds using diffusion processes and Voronoi tessellations without training.

Ax Zhiyuan Huang, Jiahao Chen, Bing Su 27d ago

LoFT: Parameter-Efficient Fine-Tuning for Long-tailed Semi-Supervised Learning in Open-World Scenarios

LoFT is a parameter-efficient fine-tuning approach for long-tailed semi-supervised learning that leverages foundation models.

Ax Vincent Grari, Tim Arni, Thibault Laugel, Sylvain Lamprier, James Zou, Marcin Detyniecki 27d ago

ACT: Agentic Classification Tree

Agentic Classification Tree (ACT) combining LLMs with decision trees for transparent, interpretable decisions on unstructured data.

Ax Kazuki Egashira, Robin Staab, Thibaud Gloaguen, Mark Vero, Martin Vechev 27d ago

Fewer Weights, More Problems: A Practical Attack on LLM Pruning

Security analysis exposing vulnerabilities in LLM weight pruning methods used by inference engines like vLLM.

Ax Phalguni Nanda, Zaiwei Chen 27d ago

A Minimal-Assumption Analysis of Q-Learning with Time-Varying Policies

Theoretical finite-time analysis of Q-learning with time-varying policies under minimal assumptions for Markov decision processes.

Ax Mengqi Li, Lei Zhao, Anthony Man-Cho So, Ruoyu Sun, Xiao Li 27d ago

A Model Can Help Itself: Reward-Free Self-Training for LLM Reasoning

Self-evolving Post-Training (SePT) method enabling LLMs to improve reasoning without external rewards through self-generated training data.

Ax Matthew Lowery, Zhitong Xu, Da Long, Keyan Chen, Daniel S. Johnson, Yang Bai, Varun Shankar, Shandian Zhe 27d ago

Deep Gaussian Processes for Functional Maps

Deep Gaussian Processes for learning mappings between functional spaces with uncertainty quantification for spatiotemporal forecasting and climate modeling.

Ax Xingtu Liu 27d ago

An Information-Theoretic Analysis of OOD Generalization in Meta-Reinforcement Learning

Information-theoretic analysis of out-of-distribution generalization in meta-reinforcement learning with bounds under distribution shift scenarios.