AIMER: Calibration-Free Task-Agnostic MoE Pruning
Calibration-free pruning method for Mixture-of-Experts language models to reduce memory and serving overhead.
Policy optimization approach addressing overthinking in large reasoning models through difficulty-differentiated training.
Study on synthetic data augmentation for efficient pre-training with better loss scaling using synthetic megadocs.
Research on active auditing framework against backdoor attacks in decentralized federated learning systems.
GAPSL: gradient-aligned parallel split learning for federated learning on heterogeneous data, reducing client computational load.
Transfers statistical methods from particle physics for UAV propeller fault detection using spectral features and neural inference.
SINDy-KANs combines Kolmogorov-Arnold networks with sparse identification to learn interpretable equations for nonlinear dynamical systems.
Shows Transformers learn robust in-context regression under distributional uncertainty without restrictive assumptions on data and noise.
SpecForge: open-source training framework for speculative decoding draft models, improving LLM inference latency through token batching.
Demonstrates adversarial attacks on GNNs that exploit unlearning mechanisms designed for GDPR compliance in graph learning systems.
Systematic analysis of Elastic Weight Consolidation for continual learning, identifying issues with importance estimation and weight regularization methods.
Evaluates model-free policy optimization algorithms using exact blackjack oracle with ground-truth benchmarks for discrete stochastic control.
Investigates multi-corpus training in speech spoofing detection using self-supervised learning, finding domain-specific biases harm generalization.
Studies label inference attacks in vertical federated learning, analyzing vulnerabilities when passive parties infer active party's labels and proposing defenses.
HISR proposes segmental process rewards for multi-turn RL in LLM agents, addressing sparse reward propagation and credit assignment in long-horizon decision-making tasks.
Investigates transfer learning from audio and time-series foundation models to scientific time-series via cross-domain distillation.
Proposes OCP method for improving item embeddings in large-scale commodity recommendation systems.
Studies off-policy learning in contextual bandits with supply constraints for recommendation and advertising systems.
Causal-theoretic approach for reward modeling using observational user feedback instead of expensive annotated data for RLHF alignment.
Ablation study examining necessity of components in Group Relative Policy Optimization for teaching LLMs reasoning and mathematical ability.
Deep VAE-GAN approach improving reservoir parameterization for data assimilation in petroleum reservoir simulation.
AutoPipe framework for automatically configuring LLM post-training pipelines combining supervised fine-tuning and reinforcement learning under budget constraints.
Study on using discriminators to enhance generative model training across GANs, weak learner frameworks, and diffusion models.
Method mitigating asynchronous data drift in federated learning where different devices experience different distribution shifts.
Neuroscience framework introducing authority-level priors to hierarchical predictive processing for understanding autonomic regulation.
Theoretical error analysis of the Adam optimizer for training deep neural networks and related models.
Framework using normalizing flows to approximate diffusion process transition probability densities by solving Fokker-Planck equations.
Ensemble framework for loan default prediction handling nonlinear relationships and class imbalance in financial datasets.
Method augmenting reinforcement learning from verifiable rewards with context bootstrapping to improve exploration and reasoning pattern acquisition.
ML framework for anomaly detection in power plant monitoring systems balancing performance and fairness across regions.
Bayesian model for drug discovery incorporating variable selection and side information through inductive matrix completion.
Intrinsic reward method for reinforcement learning agents maximizing entropy of future state-action visitation distributions.
Novel symmetric Turing Test variant where groups of LLMs and humans interact, judge, and respond in time-bounded discussions.
Benchmarking study comparing AI agents' performance to human experts on domain-specific data science tasks, evaluating LLM-based automation of data science workflows.
Analysis of differential privacy guarantees and convergence in wireless federated learning without restrictive convexity assumptions.
CoMFed framework for communication-efficient federated learning with heterogeneous multimodal clients and privacy preservation.
Theoretical analysis questioning foundations of Spectral Graph Neural Networks for node classification tasks.
Study of multimodal jailbreak attacks on Spoken Language Models using gradient-based optimization across text and audio modalities.
Research on efficiency metrics for Vision-Language-Action embodied agents, showing that parameter/FLOP counts don't reflect real robotic platform performance.
Stock prediction framework using autoencoders and transformers with reinforcement learning for adaptive market regime detection.
Hierarchical Bayesian model for online latent-cause inference balancing generalization and discrimination in learning.
SHAPCA: interpretability framework combining SHAP and PCA for explainable ML on high-dimensional spectroscopy data.
Continual learning method using random projection layers with pretrained models for improved representation learning.
DyMoE: dynamic expert selection with mixed-precision quantization for efficient MoE model inference on edge devices.
SOL-ExecBench: 235-problem benchmark for CUDA kernel optimization against hardware efficiency limits for agentic AI systems.
MIDST challenge evaluating membership inference attacks on synthetic tabular data generated by diffusion models.
Statistical method for improving treatment effect estimation by aligning RCTs and observational studies under covariate mismatch.
Security analysis of phishing detectors examining evasion costs and robustness under adversarial feature manipulation.
Online learning algorithms for sequential decision-making with ranking feedback instead of numeric utility feedback.
CONSTRUCT method for real-time trustworthiness scoring of LLM structured outputs and field-level error detection.