OPO-CMDP presents the first policy optimization algorithm for contextual MDPs with general offline function approximation, achieving near-optimal regret bounds.
HBVLA applies 1-bit post-training quantization to vision-language-action models for efficient deployment on resource-constrained robots and edge devices.
Bi-level optimization framework using neural networks for operational optimization of thermal power systems with hierarchical variables.
Novel matrix-free eigendecomposition method using discrete double-bracket flows that is invariant to isotropic noise shifts.
Study of instruction-tuning data selection for LLMs using semantic representation similarity to identify redundancy in large-scale datasets.
MEMTS introduces parameterized memory for domain adaptation of time series foundation models to handle temporal distribution shifts and domain-specific patterns.
MechPert predicts transcriptional responses to unseen genetic perturbations using mechanistic consensus as inductive bias, combining knowledge graphs with LLM reasoning.
Cast-R1 applies tool-augmented sequential decision policies and iterative reasoning to time series forecasting, enabling autonomous evidence acquisition and prediction revision.
Physics-driven Fourier-spectral solver for untrained neural networks applied to electromagnetic inverse scattering via spectral-domain optimization.
AnomaMind uses agentic reasoning with tool augmentation for time series anomaly detection, framing it as evidence-driven diagnosis rather than fixed discriminative prediction.
Mean velocity policy for reinforcement learning enabling fast one-step action generation with velocity constraints.
Variational flow-matching framework for simulation-based inference respecting structured domains with discrete-continuous variables.
Foundation model for multimodal sleep biosignals handling heterogeneous devices and sensor dropout for sleep staging.
Theoretical analysis of algorithmic stability and generalization error bounds for minimum-norm deep ReLU networks.
Graph neural network benchmark for repository-level bug localization using code graph structure beyond standard LLM context windows.
Analysis comparing code generation learnability to reinforcement learning, proposing a hierarchy of feedback quality as the ceiling on ML progress.
Multi-agent AutoML framework using LLM-based code generation with modular architecture to reduce hallucinations and improve verifiability.
Adaptive model selection framework for demand forecasting across multiple SKUs and planning horizons.
End-to-end learnable tokenization for LLMs using reinforcement learning instead of hardcoded compression steps.
Training paradigm embedding experience replay in reinforcement learning for LMs to learn from sparse, delayed environmental feedback.
Quantized reinforcement learning for LLM training that improves rollout efficiency by 30% using quantized actor networks.
State-space models (Mamba) applied to natural product chemistry for molecular property prediction and generation.
Theoretical analysis of constant-stepsize stochastic approximation with Gaussian approximations and tail bounds for convergence.
Multimodal benchmark for evaluating climate forecasting services using large language models for decision-making under uncertainty.
Foundation model for time series that uses latent-space predictive learning instead of direct future value prediction.
Spatio-temporal forecasting framework for traffic prediction under structural and observational uncertainties in transportation networks.
Position encoding technique using random float sampling to improve transformer length generalization beyond pretraining sequence lengths.
Federated learning system for edge devices using energy harvesting to reduce battery depletion during collaborative model training.
Parameter-efficient fine-tuning for vision models using policy gradient with adaptive entropy annealing to prevent catastrophic forgetting in class-incremental learning.
Theoretical analysis of Neural Optimal Transport in Hilbert spaces, addressing spurious solutions through regular measures framework and Gaussian smoothing.
Physics-informed PointNets and geometry-aware neural operators for modeling flows across porous structures with coupled physics and diverse geometries.
Sanity checks validating whether sparse autoencoders recover meaningful features beyond random baselines for neural network interpretability.
ROAST uses on-distribution rollouts for parameter-efficient LLM activation steering at inference time, replacing off-distribution supervision with continuous soft scaling.
Plug-and-play regularization losses for Mixture-of-Experts models promoting expert specialization within and across layers without structural modifications.
Comprehensive analysis of malicious prompt classifier robustness under distribution shift, using 18 datasets spanning jailbreaks and prompt injections for LLM agents.
Pivot-driven resampling technique for deep, dense exploration in LLM RL, discovering high-quality trajectories in language space within a limited sampling budget.
TS-Haystack benchmark evaluates time series language models on long-context retrieval with millions of datapoints, requiring precise temporal localization.
Characterizes optimal batch size scheduling for large-scale deep learning under a fixed data budget using a functional scaling law framework.
MAGE optimizes KV cache memory access in block diffusion LLMs for long-context settings using dynamic sparse attention adapted to the unique characteristics of block diffusion.
RMB-CLE framework for multi-task learning integrating error-based task clustering with local ensembling to mitigate negative transfer from unrelated tasks.
Analysis identifying five recurring biases in financial LLM applications that invalidate deployment claims: look-ahead, survivorship, narrative, objective, and cost bias.
MAD framework treats tabular anomaly detection as multi-agent debate, leveraging disagreement from heterogeneous model families under distribution shift and rare-anomaly regimes.
Transfer learning approach using LSTM for cross-household hot water demand forecasting to optimize heat pump operation and reduce energy waste.
Radial-VCReg augments VCReg with a radial Gaussianization loss for improved self-supervised representation learning by aligning feature norms with the chi distribution.
Framework leveraging transformer-based language models for causal inference from unstructured text, comparing estimates against structured data baselines.
Testing methodology for AI/ML and quantum systems addressing high-dimensional inputs, probabilistic outputs, and evaluation of trustworthiness, fairness, and robustness.
Framework for adaptive multi-turn LLM interactions to efficiently elicit group-level information from surveys, optimizing respondent selection and questioning strategy.
KernelBlaster uses agentic workflows with in-context RL to optimize CUDA code across GPU architectures, aggregating knowledge from prior optimizations without expensive finetuning.
MLAT framework exposes pre-trained ML models as callable tools within LLM agent workflows, enabling agents to invoke quantitative predictions and reason about outputs contextually.
Federated learning approach (DeepFusion) for training MoE-based LLMs using knowledge distillation from heterogeneous edge devices, enabling privacy-preserving distributed training.