Isolater - Feed

Ax Yongcan Huang, Li Jiang, Ze Yu Liu 19d ago

Evaluating the Generalizability of Foundation Models for Extreme Environmental Events: Case Study of California Wildfire PM2.5

Evaluation of time series foundation models on wildfire PM2.5 forecasting, assessing generalization under extreme out-of-distribution conditions.

Ax Tommaso Cerruti, Tim Rieder, George Rowlands, Lingfeng Jin, Imanol Schlag 19d ago

Linear Attention Architectures: Mechanisms, Trade-offs, and Cross-Layer Routing

Comparative study of linear attention architectures versus softmax attention, analyzing efficiency trade-offs for long context processing.

Ax Donghyun Lee, Yuhang Li, Ruokai Yin, Priyadarshini Panda 19d ago

KronQ: LLM Quantization via Kronecker-Factored Hessian

KronQ, a post-training quantization framework that uses a Kronecker-factored Hessian to improve LLM compression beyond standard PTQ methods.

Ax Nivasini Ananthakrishnan, Mark Bedaywi, Michael I. Jordan, Stuart Russell, Nika Haghtalab 19d ago

Provably Optimal Learning Algorithms for Assistance Games

Provably efficient learning algorithms for assistance games where informed and uninformed agents repeatedly interact with shared objectives.

Ax Shuo Huai, Di Liu, Hao Kong, Xiangzhong Luo, Weichen Liu, Ravi Subramaniam, Christian Makaya, Qian Lin 19d ago

Collate: Collaborative Neural Network Learning for Latency-Critical Edge Systems

Federated learning system optimizing inference latency for heterogeneous edge devices in real-time collaborative neural network training.

Ax Weiheng Zhong, Jing Bi, Victor Oancea, Hadi Meidani 19d ago

PGD-NO: A Neural Operator with Precomputed Geometry Decomposition for 3D Million-scale Physics Simulations

Neural operator architecture using precomputed geometry decomposition for scaling physics simulations to million-scale 3D problems.

Ax Hyeju Shin, Chorwon Kim, Ryangsoo Kim, Hark Yoo, Jaein Kim 19d ago

Rethinking Small VLM Quantization: From Component-Wise Analysis to Hardware-Aware Edge Deployment

Systematic evaluation of quantization for small vision-language models with hardware-aware deployment on edge devices like Jetson Orin.

Ax Ashwin Gerard Colaco, Nada Lahjouji 19d ago

What to Keep, What to Forget: A Rate-Distortion View of Memory Compaction in LLMs and Agents

Rate-distortion framework for memory compaction in LLMs and agents, analyzing KV cache, prompt, and state compression trade-offs.

Ax Henry Hunt, Mason Kamb, Surya Ganguli 19d ago

An exact information theory of generalization phase transitions in Bayesian diffusion models

Theoretical analysis of how Bayesian diffusion models avoid overfitting, viewed through an information-theoretic lens with analytically tractable models.

Ax Samuel Tetteh, Udip Shrestha, Joshua R. Waite, Cody Fleming 19d ago

Who Analyses the Analyser? Self-Validating LLM Hazard Analysis with Constitutional Meta-STPA

Framework for validating LLM-assisted safety analysis tools using Constitutional Meta-STPA to identify hallucinations in system analysis.

Ax Yidong Ouyang, Zhe Wang, Sourav Bhabesh, Dmitriy Bespalov 19d ago

Reinforcing the Generation Order of Multimodal Masked Diffusion Models

Research on optimizing token generation order in diffusion models for text-to-image synthesis and multimodal understanding tasks.

Ax Jiantong Jiang, Peiyu Yang, Rui Zhang, Feng Liu 19d ago

Towards Efficient Large Language Model Serving: A Survey on System-Aware KV Cache Optimization

Survey on system-aware KV cache optimization techniques for efficient LLM serving, addressing memory-intensive inference bottlenecks.

Ax Mayank Singal 19d ago

When Thinking Hurts: Epistemic Signals in the Reasoning Chains of Visual Language Models

Empirical characterization of uncertainty quantification in vision language models with chain-of-thought reasoning.

Ax Ethan Roland, Murat Cubuktepe, Erick Martinez, Stijn Servaes, Keenan Pepper, Mike Vaiana, Diogo Schwerz de Lucena, Judd Rosenblatt, Addie Foote, Cem Anil, Alex Cloud 19d ago

Modular Pretraining Enables Access Control

Modular pretraining framework enabling fine-grained access control over AI capabilities without training multiple models.

Ax Fuling Chen, Kevin Vinsen, Phillip Melton, Rae-Chi Huang 19d ago

DeepPySR - A Symbolic Regression Framework with Dynamic Pruning, Pareto Selection, and Hierarchical Composition for Real-World Scientific Discovery

Symbolic regression framework with dynamic pruning and Pareto selection for discovering interpretable equations from data.

Ax Sara Kangaslahti, Jonathan Geuter, Nihal V. Nayak, Marco Fumero, Francesco Locatello, David Alvarez-Melis 19d ago

Understanding Layer Patching in Model Size Interpolation

ML research on zero-shot model size interpolation via layer patching and boomerang distillation for language models.

Ax Lorenzo Pantè, Andrea Fanti, Roberto Capobianco 19d ago

Open-ended Multi-agent Autocurricula via Visual Inspection of Policies with Multi-modal LLMs

RL research using multi-modal LLMs to inspect agent policies and design open-ended curricula for training complex agents.

Ax Sumit Satishrao Shevtekar, Chandresh Kumar Maurya 19d ago

RhyMix: A Lightweight Adaptive Multi-Rhythm Network for Long-Term Time Series Forecasting

RhyMix, a lightweight adaptive network for long-term time series forecasting that captures multiple simultaneous temporal patterns through multi-rhythm modeling.

Ax Ryan Thompson, Matt P. Wand, Veerabhadran Baladandayuthapani 19d ago

Structure Learning on Clustered Data

DAG structure learning approach for causal discovery on clustered data accounting for cluster-specific variations common in scientific applications.

Ax Sai Spandana Chintapalli, Pratik Chaudhari, Christos Davatzikos 19d ago

CASL-VAE: Learning Structured Latent Variables from Unpaired Data for Semi-supervised Clustering and Paired Sample Generation

CASL-VAE, a deep contrastive latent variable model that learns structured generative factors from unpaired data for semi-supervised clustering and paired sample generation.

Ax Weiming Feng, Xiongxin Yang, Yixiao Yu, Yiyao Zhang 19d ago

Learning $\mathsf{AC}^0$ under Locally Sampleable Graphical Models

Theoretical work on learning constant-depth circuits under locally sampleable graphical models extending low-degree algorithm results to correlated distributions.

Ax Jiayi Fang 19d ago

Write-Protected Discrete Bottlenecks for Language-Grounded World Models: A Structural Limitation and Sufficient Fix

Analyzes structural limitations in language-grounded world models for robotics, identifying safety constraints for LLM/VLM feature integration with symbolic systems.

Ax Kaustubh Kumar, Ashutosh Ranjan, Vivek Srivastava, Blessin Varkey, Shirish Karande 19d ago

ArtMine: Discovering and Formalizing Artistic Processes

ArtMine, a system that uses generative AI and iterative decision reasoning to discover and formalize artistic creative processes, not just finished artifacts.

Ax Siyuan Wen, Jiahao Zeng, Ningning Ding 19d ago

AutoAnchor: Stable Diffusion Unlearning Using Cross-Attention as a Manifold Surrogate

AutoAnchor, a diffusion model unlearning method that uses cross-attention as a manifold surrogate to mitigate generation of harmful or copyrighted content.

Ax Donghwan Lee 19d ago

Spectral Analysis of Dueling Q-Learning

Spectral analysis of the dueling Q-learning algorithm, which extends Q-learning with a value/advantage function decomposition for high-dimensional RL problems.

Ax Amir Asiaee 19d ago

Certified Interventional Fidelity: Anytime-Valid, Adaptive Evaluation of Causal Claims in Mechanistic Interpretability

Proposes a certified interventional fidelity framework for anytime-valid, adaptive evaluation of causal claims in mechanistic interpretability research.

Ax Matthias Weiß, Athreya Hosahalli Prakash, Maurice Artelt, Falk Dettinger, Nasser Jazdi, Michael Weyrich 19d ago

Self-Adaptive Anomaly Detection with Reinforcement Learning and Human Feedback in Connected Vehicles

Reinforcement learning approach for self-adaptive anomaly detection in connected vehicles that handles system evolution caused by updates and configuration changes.

Ax Sebastian G. Gruber, Nassim Walha, Francis Bach, Florian Buettner 19d ago

Eigenvalue Calibration for Semantic Embeddings of Large Language Models

Framework for calibrating eigenvalues of semantic embeddings from LLMs for uncertainty quantification in reliable model deployment.

Ax Lachlan Ewen MacDonald, René Vidal 19d ago

Dynamics of Gradient Descent with Large Step Size Near a Manifold of Flat Minima

Theoretical analysis of gradient descent dynamics with large step sizes near flat minima manifolds, addressing violations in deep neural network training.

Ax Hong Zhao 19d ago

Beyond Backpropagation: Monte Carlo Method Can Train Deep Neural Networks

Demonstrates gradient-free training of deep neural networks using a Monte Carlo method on GPU, avoiding the vanishing/exploding gradient problems of backpropagation.

Ax Le Yang (Institute for Advanced Simulations), Anoop K. Chandran (Jülich Supercomputing Centre, Forschungszentrum Jülich), Jona Östreicher (Institute of Nanotechnology, Karlsruhe Institute of Technology), Evgenii Sovetkin (Jülich Supercomputing Centre, Forschungszentrum Jülich), Adrian Mirza (Helmholtz-Zentrum Berlin für Materialien und Energie, Helmholtz Institute for Polymers in Energy Applications Jena), Sebastien Bompas (Institute for Advanced Simulations), Bashir Kazimi (Institute for Advanced Simulations), Pascal Friederich (Institute of Nanotechnology, Karlsruhe Institute of Technology), Stefan Kesselheim (Jülich Supercomputing Centre, Forschungszentrum Jülich, 1. Physikalisches Institut, University of Cologne), Kevin Maik Jablonka (Helmholtz Institute for Polymers in Energy Applications Jena, Center for Energy and Environmental Chemistry Jena, Friedrich Schiller University Jena), Stefan Sandfeld (Institute for Advanced Simulations, Faculty 5 - Georesources and Materials Engineering, RWTH Aachen University) 19d ago

MatBind: A Shared Embedding Space for Multimodal Materials Characterization

MatBind creates a shared embedding space for multimodal materials characterization, integrating atomic structures, diffraction patterns, density of states, and language.

Ax Xia Cui, Ziyi Huang, N. R. Abeynayake 19d ago

Ensemble Diversity Optimization for Subjective Supervision

Proposes the Ensemble Diversity Optimization (EDO) framework for NLP tasks with annotator disagreement, jointly optimizing ensemble weights and calibration via a differentiable objective.

Ax Hafsa Mateen, Radu Timofte, Dmitry Ignatov 19d ago

Systematic Evaluation of Learning Rate Scheduling Strategies Across Heterogeneous Architectures

Systematic evaluation of learning rate scheduling strategies across 30 neural network architectures (CNNs and transformers) to understand their impact on classification accuracy.

Ax Ofir Arviv, Kristjan Greenewald, Yotam Perlitz, Hadar Mulian, Michal Shmueli-Scheuer, Leshem Choshen 19d ago

Stop Guessing When to Stop Testing: Efficient Model Evaluation with Just Enough Data

Efficient model evaluation method using adaptive sample sizes to match diverse evaluation objectives and reduce computational cost.

Ax Xin Wang, Yunshi Wen, Yanan He, Haotian Xu, Youlan Zhao, Michel Ferreira Cardia Haddad, Tengfei Ma 19d ago

CAAD: Causality-Aware Multivariate Time Series Anomaly Detection via Multi-Scale Alignment and Structural Causal Consistency

Anomaly detection framework incorporating causal relationships and structural consistency for industrial system failure diagnosis.

Ax Dan Yamins, Aran Nayebi 19d ago