SLM Finetuning for Natural Language to Domain Specific Code Generation in Production
Fine-tuning small language models for domain-specific code generation in production environments with strict latency requirements.
Kaczmarz-based preference learning algorithms for real-time matchmaking that replace recency-biased normalization with stably convergent updates.
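For context on the named building block: classic randomized Kaczmarz solves a linear system by projecting the iterate onto one equation at a time, which is what makes it attractive for streaming updates. A minimal NumPy sketch of that base iteration (the paper's matchmaking-specific variant is not reproduced here):

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=5000, seed=0):
    """Solve A x ~= b by projecting the iterate onto one row at a time.

    Rows are sampled with probability proportional to ||a_i||^2, the
    classic randomized Kaczmarz scheme with linear convergence in
    expectation for consistent systems.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms = np.einsum("ij,ij->i", A, A)
    probs = row_norms / row_norms.sum()
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        x += (b[i] - A[i] @ x) / row_norms[i] * A[i]
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(200, 10))
x_true = np.arange(10.0)
x_hat = randomized_kaczmarz(A, A @ x_true)
print(np.linalg.norm(x_hat - x_true))  # ~0 on this consistent system
```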
Extension of Muon optimizer reducing computational overhead in foundation model pre-training through adaptive second-moment preconditioning.
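Background for this entry: Muon's core step orthogonalizes the momentum matrix with a Newton-Schulz iteration; the proposed second-moment preconditioning sits on top of that and is not sketched here. A minimal cubic Newton-Schulz orthogonalization (Muon itself uses a tuned quintic polynomial):

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=20):
    """Approximate the nearest semi-orthogonal matrix to G.

    The cubic iteration X <- 1.5 X - 0.5 X X^T X drives every singular
    value of X toward 1, provided they start in (0, sqrt(3)); dividing
    by the Frobenius norm guarantees that precondition.
    """
    X = G / (np.linalg.norm(G) + 1e-12)
    for _ in range(steps):
        X = 1.5 * X - 0.5 * X @ X.T @ X
    return X

G = np.random.default_rng(0).normal(size=(64, 32))
O = newton_schulz_orthogonalize(G)
print(np.allclose(O.T @ O, np.eye(32), atol=1e-2))  # ~orthogonal columns
```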
Decentralized learning framework combining adaptive gradients and compressed communication for federated settings with multiple local training steps.
First benchmark for multi-source domain generalization in automatic sleep staging with noisy labels across institutions and devices.
Closed-form method for concept erasure in diffusion models using double projections without iterative optimization.
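The single-projection building block behind such closed-form erasure is standard linear algebra; the entry's specific double-projection construction is not reproduced here. A minimal sketch of removing a concept subspace from weight rows with one exact projection and no iterative optimization:

```python
import numpy as np

def erase_subspace(W, C):
    """Project the rows of W off the subspace spanned by the columns of C.

    P = I - C (C^T C)^{-1} C^T is the orthogonal projector onto the
    complement of span(C); W @ P zeroes each row's component along the
    concept directions exactly, in closed form.
    """
    P = np.eye(C.shape[0]) - C @ np.linalg.solve(C.T @ C, C.T)
    return W @ P

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8))   # e.g. a cross-attention weight block
C = rng.normal(size=(8, 2))    # two concept directions in feature space
W_erased = erase_subspace(W, C)
print(np.allclose(W_erased @ C, 0))  # rows now orthogonal to the concepts
```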
Cross-validated self-attention with denoising for automatic modulation classification under low signal-to-noise conditions.
Characterizes necessary and sufficient conditions for reward poisoning attacks in reinforcement learning with linear MDPs.
Heterogeneous graph network with critical-path awareness for long-horizon flexible job-shop scheduling using rolling horizon optimization.
Theoretical analysis of why transformers learn optimal DDPM denoiser for multi-token Gaussian mixture models.
Survey on attention sink phenomenon in transformers, covering utilization, interpretation, and mitigation strategies.
Automated DNN optimization for PPG-based blood pressure estimation on resource-constrained wearable devices.
Distributed consensus-based framework for recursive multi-output Gaussian processes in large-scale streaming settings.
Temporally augmented graph attention network for affordance classification from EEG sequential data.
Interprets internal computation of Leela Chess Zero transformer using sparse decomposition to explain grandmaster-level reasoning.
Spatial-temporal graph neural networks for virtual metering in sparsely instrumented district heating networks.
Theoretical bounds on Hessian eigenspectrum for cross-entropy loss in nonlinear neural networks.
Theoretical analysis of asymmetric tensor PCA showing gradient descent benefits from mild over-parameterization.
Studies fairness-aware criteria in automated machine learning frameworks to mitigate bias and discriminatory outcomes.
Multi-head attention fusion network for predicting degradation of industrial machinery operating under changing conditions.
Theoretical analysis proving phase displacement in Kuramoto oscillator networks equals gradient of loss for frequency learning.
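For readers unfamiliar with the model in question: a Kuramoto network couples N phase oscillators through pairwise sine interactions, dθ_i/dt = ω_i + (K/N) Σ_j sin(θ_j − θ_i). A minimal forward-Euler simulation of those standard dynamics (the paper's gradient identity is not derived here):

```python
import numpy as np

def simulate_kuramoto(omega, K=2.0, dt=0.01, steps=5000, seed=0):
    """Integrate d(theta_i)/dt = omega_i + (K/N) sum_j sin(theta_j - theta_i)."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0, 2 * np.pi, len(omega))
    for _ in range(steps):
        # entry (i, j) of the matrix is sin(theta_j - theta_i); mean over j
        coupling = np.sin(theta[None, :] - theta[:, None]).mean(axis=1)
        theta += dt * (omega + K * coupling)
    return theta

omega = np.random.default_rng(1).normal(0.0, 0.2, size=50)
theta = simulate_kuramoto(omega)
r = abs(np.exp(1j * theta).mean())  # order parameter: ~1 means synchronized
print(f"order parameter r = {r:.3f}")
```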
Graph neural network with diffusion-contrastive learning for wind nowcasting in regions lacking dense observation networks.
Combines SAINT attention mechanism with tree-based models like XGBoost for improved employee attrition prediction on tabular HR data.
AI agents for optimizing community water distribution systems by scheduling pumps and valves to meet demands while minimizing energy in dynamic real-world environments.
Combines physics-informed neural networks with quantum feature mapping for battery state-of-health estimation across chemistries.
Proposes SGED-TCD framework for lag-resolved causal discovery in multivariate time series with applications to environmental data.
Presents VeriSpecGen for automatic formal specification synthesis from natural language using LLMs with traceability for code verification.
Introduces LIRA method to defend LLMs against jailbreaks, backdoors, and unlearning by training models to align instruction representations.
Proposes CARE-ECG, causal agent-based reasoning framework for explainable ECG interpretation combining LLMs with physiological structure.
Demonstrates membership inference attacks on ECG foundation encoders, exposing participation privacy risks in self-supervised pretraining.
Proposes physics-aware spiking neural networks for energy-efficient wearable IMU-based human activity recognition on edge devices.
Organizes diffusion model fundamentals from Langevin perspective, offering simplified mathematical framework for beginners.
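The Langevin perspective this entry organizes around rests on the unadjusted Langevin update x ← x + (ε/2)∇log p(x) + √ε z, which samples from p when the score ∇log p is known. A minimal sketch on a 2-D Gaussian whose score is available in closed form:

```python
import numpy as np

def langevin_sample(score, x0, eps=1e-2, steps=2000, seed=0):
    """Unadjusted Langevin dynamics: x <- x + (eps/2) score(x) + sqrt(eps) z."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + 0.5 * eps * score(x) + np.sqrt(eps) * rng.normal(size=x.shape)
    return x

# Target: N(mu, sigma^2 I), whose score is grad log p(x) = -(x - mu) / sigma^2.
mu, sigma = np.array([2.0, -1.0]), 0.5
score = lambda x: -(x - mu) / sigma**2
samples = np.stack([langevin_sample(score, [0.0, 0.0], seed=s) for s in range(200)])
print(samples.mean(axis=0), samples.std(axis=0))  # ~mu and ~sigma
```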
Derives exact finite-sample variance decomposition for subagging ensembles, providing mathematical characterization of resampling ratios.
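The exact decomposition is the paper's contribution and is not reproduced; below is a small Monte-Carlo sketch of the quantity it characterizes, the variance of a subagged estimator (here the sample median) as the subsample ratio varies:

```python
import numpy as np

def subagged_median(x, ratio, n_bags, rng):
    """Average of medians over subsamples drawn without replacement."""
    m = max(1, int(ratio * len(x)))
    return np.mean([np.median(rng.choice(x, size=m, replace=False))
                    for _ in range(n_bags)])

rng = np.random.default_rng(0)
for ratio in (0.2, 0.5, 0.8):
    ests = [subagged_median(rng.standard_normal(200), ratio, 50, rng)
            for _ in range(300)]
    print(f"ratio={ratio:.1f}  var={np.var(ests):.5f}")
```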
Proposes CodeQuant for quantizing mixture-of-experts models by combining clustering and quantization to handle outlier-induced errors.
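Cluster-based codebook quantization is the generic combination this entry names; whether CodeQuant uses k-means specifically is an assumption here. A minimal 1-D k-means codebook quantizer, with injected outliers to expose the error mode such methods must handle:

```python
import numpy as np

def codebook_quantize(w, n_codes=16, iters=25, seed=0):
    """Quantize a weight vector to a k-means codebook (cluster + round)."""
    rng = np.random.default_rng(seed)
    codes = rng.choice(w, n_codes, replace=False)
    for _ in range(iters):
        assign = np.abs(w[:, None] - codes[None, :]).argmin(axis=1)
        for k in range(n_codes):
            if (assign == k).any():
                codes[k] = w[assign == k].mean()
    assign = np.abs(w[:, None] - codes[None, :]).argmin(axis=1)
    return codes[assign]

w = np.random.default_rng(1).normal(size=4096)
w[:8] *= 50                      # inject outliers that distort the codebook
w_q = codebook_quantize(w)
print(f"mse = {np.mean((w - w_q) ** 2):.4f}")
```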
Introduces PepBenchmark, standardized benchmark with datasets and protocols for peptide drug discovery machine learning.
Presents IceCache for memory-efficient KV-cache management in long-sequence LLMs via CPU offloading and selective GPU retention.
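IceCache's actual retention policy is not described in this one-line summary, so the sketch below is hypothetical: a two-tier cache that keeps the highest-scoring KV blocks in a hot tier (standing in for GPU memory), spills the rest to a cold tier (standing in for CPU), and promotes blocks back on access:

```python
import numpy as np

class TwoTierKVCache:
    """Hypothetical sketch of selective retention with CPU offloading.

    Plain dicts stand in for GPU/CPU memory; `score` is an assumed
    importance signal (e.g. attention mass) used to pick eviction victims.
    """

    def __init__(self, gpu_budget):
        self.gpu_budget = gpu_budget          # max blocks resident on "GPU"
        self.gpu, self.cpu = {}, {}           # block_id -> (kv_array, score)

    def put(self, block_id, kv, score):
        self.gpu[block_id] = (kv, score)
        self._evict_if_needed()

    def get(self, block_id):
        if block_id in self.cpu:              # promote on access
            self.gpu[block_id] = self.cpu.pop(block_id)
            self._evict_if_needed()
        return self.gpu[block_id][0]

    def _evict_if_needed(self):
        while len(self.gpu) > self.gpu_budget:
            coldest = min(self.gpu, key=lambda b: self.gpu[b][1])
            self.cpu[coldest] = self.gpu.pop(coldest)

cache = TwoTierKVCache(gpu_budget=2)
for i, score in enumerate([0.9, 0.1, 0.5]):
    cache.put(i, np.zeros((4, 64)), score)   # placeholder KV block
print(sorted(cache.gpu), sorted(cache.cpu))  # hot blocks kept, cold spilled
```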
Proposes WaveMoE, a mixture-of-experts foundation model for time series forecasting using wavelet-enhanced frequency-domain information.
Proposes Profiled Sparse Networks with heterogeneous connectivity patterns, benchmarked on vision and tabular classification tasks.
Introduces ReadMOF framework using chemical nomenclature and pretrained language models for metal-organic framework property prediction.
Studies how reward hacking during RLHF fine-tuning degrades LLM calibration and uncertainty quantification despite improving helpfulness.
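Calibration in this setting is commonly quantified with expected calibration error (ECE); that the paper uses binned ECE specifically is an assumption. A minimal sketch against a synthetic overconfident model:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean |accuracy - confidence| over confidence bins."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece

rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, 10_000)
correct = rng.uniform(size=10_000) < conf**2   # accuracy below stated confidence
print(f"ECE = {expected_calibration_error(conf, correct):.3f}")  # ~0.17
```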
Explores online continual self-supervised learning with focus on stability-plasticity trade-off in models learning from unlabeled streaming data.
MoEITS: green AI approach for reducing computational burden of Mixture-of-Experts LLMs through simplification.
Machine unlearning method for removing training data influence without direct access to forget sets.
Spectral analysis of LoRA weight updates showing low-frequency dominance enables efficient parameter-efficient fine-tuning.
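Whether "spectral" here means Fourier or singular-value structure is ambiguous from the one-line summary; a baseline fact either way is that the LoRA update ΔW = BA has rank at most r. An illustrative sketch of inspecting its singular spectrum (not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 512, 8
# LoRA parameterization: delta_W = B @ A with small inner rank r.
A = rng.normal(0, 1 / np.sqrt(r), size=(r, d))
B = rng.normal(0, 1 / np.sqrt(d), size=(d, r))
delta_W = B @ A

s = np.linalg.svd(delta_W, compute_uv=False)
print(f"top-{r} singular values: {np.round(s[:r], 3)}")
print(f"spectral mass beyond rank {r}: {s[r:].sum():.2e}")  # ~0 by construction
```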
Federated learning framework for IoT networks with energy efficiency optimization for small-scale datasets.
Self-distillation method for multi-turn LLM agents using skill-conditioning to improve sample efficiency in reinforcement learning.
On-policy distillation method for LLM alignment with adaptive weighting based on signal quality and credit assignment.
Communication-efficient optimization method extending Muon for federated learning of large language models.
Revisits value modeling in LLM reinforcement learning using generative critics for improved credit assignment.
Transformer architecture that dynamically determines its own depth and width during training by pruning redundant heads.