Accelerated Learning with Linear Temporal Logic using Differentiable Simulation
Integration of linear temporal logic specifications into RL using differentiable simulation for safe, correct-by-construction controller synthesis.
Noise-robust exploration method for RL using learning progress monitoring to escape unlearnable noise sources with improved sample efficiency.
ARMOR: one-shot post-training pruning algorithm for LLMs achieving 2:4 semi-structured sparsity with minimal performance degradation for efficient deployment.
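2:4 semi-structured sparsity keeps exactly two non-zero weights in every contiguous group of four, a pattern modern GPUs can accelerate. A minimal magnitude-based sketch of the pattern (the selection criterion here is a generic baseline, not ARMOR's actual algorithm):

```python
import numpy as np

def prune_2_4(w):
    """Zero out the 2 smallest-magnitude weights in each group of 4.

    Generic magnitude-based baseline, not ARMOR's selection criterion.
    """
    w = np.asarray(w, dtype=float)
    assert w.size % 4 == 0, "weight count must be divisible by 4"
    groups = w.reshape(-1, 4)
    # Indices of the 2 smallest |w| in each group of 4
    drop = np.argsort(np.abs(groups), axis=1)[:, :2]
    pruned = groups.copy()
    np.put_along_axis(pruned, drop, 0.0, axis=1)
    return pruned.reshape(w.shape)

w = np.array([0.9, -0.1, 0.3, -0.7, 0.2, 0.8, -0.05, 0.4])
w_sparse = prune_2_4(w)  # every group of 4 now has exactly 2 zeros
```

One-shot methods like ARMOR refine which pair to keep (and how to compensate the remaining weights) so that the 2:4 constraint costs as little accuracy as possible.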
Theoretical analysis of convergence guarantees for decentralized stochastic gradient descent with high-probability bounds and reduced assumptions.
Extension of Forward-Forward algorithm to reinforcement learning with action-conditioned Q-functions, replacing backpropagation with local learning.
f-INE framework for stable influence estimation under training randomness, addressing instability in sample-level impact estimation for data curation.
Dataset distillation method leveraging diffusion models as priors to synthesize compact, representative datasets with improved diversity and generalization.
Deep learning approach for antimicrobial peptide design using semi-supervised latent Bayesian optimization with improved interpretability.
MusicRFM framework enabling fine-grained control over pre-trained music generation models by steering internal activations via Recursive Feature Machines.
Bayesian parameter inference method for complex stochastic simulators using differentiable approaches to reduce simulation costs in high-dimensional spaces.
Goal-driven reward signals derived from pretrained video diffusion models for training reinforcement learning agents.
Distillation-based continual learning with classifier-proximal plugins addressing stability-plasticity tradeoff.
Method to robustify activation sparsity in LLMs by addressing representational instability during inference acceleration.
Machine learning approach for early chronic kidney disease screening in low-resource settings using explainable models.
Scalable multi-concept unlearning in text-to-image diffusion models addressing weight conflicts and collateral damage.
Analysis of extreme variance in certified robustness verification across neural network model seeds.
Textual Equilibrium Propagation for optimizing compound AI systems with multiple modules in long-horizon agentic workflows.
Early classification of time series in non-stationary environments with uncertain and time-varying decision costs.
ChronoSpike: Adaptive spiking graph neural network for dynamic graph representation learning with event-driven efficiency.
Unified training-serving system combining RL with adaptive speculative decoding for accelerated LLM inference.
Infusion: Framework using influence functions to craft training data perturbations that induce targeted model behavior changes.
Uncertainty quantification for machine learning interatomic potentials using evidential deep learning.
Geometric analysis of optimization dynamics in transformers trained on modular arithmetic revealing low-dimensional subspaces.
Study on early-warning signals of grokking via loss-landscape geometry on SCAN and Dyck-1 benchmarks.
CeRA: Parameter-efficient fine-tuning method extending LoRA with non-linear capacity expansion via gating and dropout.
Physics-informed neural operators for solving PDEs with improved generalization beyond training distributions.
SafeSci: Framework for evaluating safety of large language models in scientific domains with comprehensive benchmarks.
CRISP: Method for teaching LLMs to reason more concisely via self-distillation with 'be concise' conditioning.
Stock market forecasting using a Node Transformer combined with BERT-based sentiment analysis.
WinDiNet uses a pretrained video diffusion model as a differentiable physics simulator for urban wind-flow prediction, replacing expensive CFD simulations.
λ-GELU parameterized gating function enabling controlled ReLU conversion while maintaining smooth activation properties for deployment.
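One way a parameterized gate can bridge GELU and ReLU is to sharpen the Gaussian-CDF gate with a scale λ; this is an illustrative parameterization, not necessarily the paper's exact form:

```python
import math

def lambda_gelu(x, lam=1.0):
    """x * Phi(lam * x): lam=1 recovers standard GELU; as lam grows the
    gate approaches a step function, converting the activation to ReLU.
    (Hypothetical parameterization for illustration only.)
    """
    gate = 0.5 * (1.0 + math.erf(lam * x / math.sqrt(2.0)))
    return x * gate

standard = lambda_gelu(1.0)                  # GELU(1) ~ 0.8413
relu_like_pos = lambda_gelu(3.0, lam=50.0)   # ~ 3.0, like ReLU
relu_like_neg = lambda_gelu(-3.0, lam=50.0)  # ~ 0.0, like ReLU
```

The appeal of such a family is that a trained model can be moved continuously from the smooth activation toward the cheaper piecewise-linear one at deployment time.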
ERPO method for token-level credit assignment in LLM reasoning models, addressing entropy collapse in GRPO through information heterogeneity.
Recurrent network training without Jacobian propagation, using hidden-state temporal credit assignment; studies gradient normalization and online adaptation.
Mathematical framework explaining phase transitions in neural network training via the spectral gap of parameter-update Gram matrices; analyzes grokking and capability gains.
Transfer learning for nonparametric Bayesian networks under scarce data; proposes PC-stable-transfer and hill-climbing transfer learning methods.
Tutorial on Bayesian Optimization for automating scientific discovery using surrogate models and probability-driven frameworks.
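The surrogate-model loop at the heart of Bayesian Optimization can be sketched in a few lines: fit a Gaussian-process surrogate to the evaluations so far, maximize an acquisition function (UCB here) over a grid, and evaluate the objective at the winner. All hyperparameters below are illustrative:

```python
import numpy as np

def rbf(a, b, ls=0.2):
    # Squared-exponential kernel between 1-D point sets
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def bo_maximize(f, n_init=3, n_iter=10, seed=0):
    """Tiny 1-D Bayesian optimization: GP surrogate + UCB acquisition.
    An illustrative sketch of the surrogate-model loop, not a library."""
    rng = np.random.default_rng(seed)
    X = list(rng.uniform(0.0, 1.0, n_init))
    y = [f(x) for x in X]
    grid = np.linspace(0.0, 1.0, 201)
    for _ in range(n_iter):
        Xa = np.array(X)
        K = rbf(Xa, Xa) + 1e-6 * np.eye(len(X))   # jitter for stability
        Ks = rbf(grid, Xa)
        alpha = np.linalg.solve(K, np.array(y))
        mu = Ks @ alpha                            # posterior mean
        var = 1.0 - np.einsum('ij,ij->i', Ks @ np.linalg.inv(K), Ks)
        ucb = mu + 2.0 * np.sqrt(np.maximum(var, 0.0))
        x_next = grid[int(np.argmax(ucb))]         # acquisition argmax
        X.append(x_next)
        y.append(f(x_next))
    return X[int(np.argmax(y))], max(y)

x_best, y_best = bo_maximize(lambda x: -(x - 0.7) ** 2)
```

In scientific-discovery settings the objective f is an expensive experiment or simulation, which is why the probability-driven acquisition step is worth its overhead.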
annbatch: mini-batch loader for terabyte-scale biological data in AnnData format, addressing memory bottlenecks in ML training on large datasets.
Asymptotic theory for quantile estimation via stochastic gradient descent with a constant learning rate.
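The recursion typically analyzed in this setting is the stochastic-gradient update on the pinball loss; a minimal sketch with a constant step size (hyperparameters are illustrative):

```python
import random

def sgd_quantile(samples, tau, lr=0.02, q0=0.0):
    """Constant-step-size SGD for the tau-quantile:
        q <- q - lr * (1{x <= q} - tau).
    With a constant learning rate the iterate does not converge to a
    point but fluctuates around the true quantile, which is exactly the
    regime such asymptotic theory characterizes."""
    q = q0
    for x in samples:
        grad = (1.0 if x <= q else 0.0) - tau
        q -= lr * grad
    return q

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(20000)]
median_est = sgd_quantile(data, tau=0.5)  # true median is 0.0
```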
MAPP mechanism for efficient data-marketplace pricing using learned value distributions.
gen2seg: using generative models (Stable Diffusion, MAE) for category-agnostic instance segmentation.
LMask: learning framework using dynamic masking for constrained routing optimization problems.
Comparison of deep neural networks against statistical methods for solving ODE inverse problems.
Analysis of tokenized U.S. Treasuries transactions on blockchain infrastructure.
Constrained free-energy minimization for quantum thermodynamic system design.
Analysis of 150+ years of German parliamentary migration debates using LLMs, revealing a shift from post-war solidarity to anti-solidarity.
ROPA: synthetic robot pose generation for RGB-D bimanual data augmentation to improve imitation learning policies.
Algorithm for column subset selection using adaptive randomized pivoting with connections to volume sampling.
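Adaptive randomized pivoting in this setting generally means sampling each pivot column with probability proportional to its squared residual norm, then orthogonally deflating the residual; a generic sketch (not the paper's exact scheme):

```python
import numpy as np

def adaptive_column_select(A, k, seed=0):
    """Pick k columns by adaptive randomized pivoting: sample each pivot
    with probability proportional to its squared residual column norm,
    then deflate the residual against the chosen column.
    (Generic adaptive-sampling sketch for illustration.)"""
    rng = np.random.default_rng(seed)
    R = np.array(A, dtype=float)
    chosen = []
    for _ in range(k):
        norms = (R ** 2).sum(axis=0)
        probs = norms / norms.sum()          # adaptive sampling weights
        j = int(rng.choice(A.shape[1], p=probs))
        chosen.append(j)
        v = R[:, j] / np.linalg.norm(R[:, j])
        # Deflate: remove the component along the selected column
        R -= np.outer(v, v @ R)
    return chosen

A = np.array([[3.0, 0.0, 3.0, 0.1],
              [0.0, 2.0, 0.0, 2.1],
              [0.0, 0.0, 0.0, 0.2]])
cols = adaptive_column_select(A, k=2)
```

The connection to volume sampling comes from the fact that sampling proportionally to residual norms biases the selection toward well-conditioned, large-volume column subsets.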
Forecasting data movement patterns in MoE LLM inference to reduce bottlenecks in multi-unit serving systems.
Fast regret bounds for contextual bandits without realizability assumptions using pessimistic policy updates.
Seer: online context learning system for fast synchronous LLM reinforcement learning, addressing rollout latency and resource utilization.