Tensor-Efficient High-Dimensional Q-learning
Tensor-based Q-learning approach that handles high-dimensional reinforcement learning by exploiting problem structure, without neural networks.
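A minimal sketch of the idea, assuming the Q-function is stored as a rank-R CP decomposition over two state dimensions and the action, with each TD step updating only the factor rows touched by the transition; the rank, shapes, and step sizes below are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
S1, S2, A, R = 10, 10, 4, 3            # two state dimensions, actions, CP rank
Af = rng.normal(scale=0.1, size=(S1, R))
Bf = rng.normal(scale=0.1, size=(S2, R))
Cf = rng.normal(scale=0.1, size=(A, R))
alpha, gamma = 0.05, 0.95

def q_value(s1, s2, a):
    # Reconstruct one Q entry from the CP factors: sum_r Af[s1,r] * Bf[s2,r] * Cf[a,r]
    return float(np.sum(Af[s1] * Bf[s2] * Cf[a]))

def td_update(s1, s2, a, r, s1n, s2n):
    # One low-rank Q-learning step: gradient of the squared TD error w.r.t. the factors.
    target = r + gamma * max(q_value(s1n, s2n, an) for an in range(A))
    delta = target - q_value(s1, s2, a)
    Af[s1] += alpha * delta * (Bf[s2] * Cf[a])
    Bf[s2] += alpha * delta * (Af[s1] * Cf[a])
    Cf[a]  += alpha * delta * (Af[s1] * Bf[s2])
    return delta
```

The full table of S1*S2*A entries is never materialized; memory scales with (S1+S2+A)*R.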
Adaptive symmetrization of KL divergence for learning probability distributions with normalizing flows and energy-based models.
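A toy sketch, assuming the adaptive symmetrization is a convex combination beta*KL(p||q) + (1-beta)*KL(q||p) with beta set from the two divergence magnitudes; the specific weighting rule here is an assumption, not the paper's.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    p, q = np.clip(p, eps, None), np.clip(q, eps, None)
    return float(np.sum(p * np.log(p / q)))

def adaptive_symmetric_kl(p, q):
    forward, reverse = kl(p, q), kl(q, p)            # mode-covering vs. mode-seeking terms
    beta = reverse / (forward + reverse + 1e-12)     # assumed rule: down-weight the dominant term
    return beta * forward + (1.0 - beta) * reverse

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.4, 0.4, 0.2])
print(adaptive_symmetric_kl(p, q))
```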
SpecQuant framework for ultra-low-bit LLM quantization using spectral decomposition and adaptive truncation for efficient on-device deployment.
TREASURE foundation model for payment transaction understanding and analysis with applications to anomaly detection.
CHiQPM provides global and local interpretability for image classification in safety-critical domains with hierarchical explanations.
Adaptive Replay Buffer (ARB) dynamically prioritizes data sampling in offline-to-online reinforcement learning to balance stability and asymptotic performance.
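A sketch of the mixing idea, assuming each minibatch is drawn partly from the offline dataset and partly from the online buffer, with the online fraction adapted from TD-error feedback; the adaptation rule is illustrative, not ARB's actual prioritization scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

class AdaptiveMixBuffer:
    """Toy offline-to-online buffer: each batch mixes offline and online transitions."""
    def __init__(self, offline_data, rho=0.1):
        self.offline = list(offline_data)
        self.online = []
        self.rho = rho                              # fraction of each batch drawn online

    def add(self, transition):
        self.online.append(transition)

    def adapt(self, online_td, offline_td):
        # Assumed rule: sample more from whichever source currently has larger TD error.
        target = online_td / (online_td + offline_td + 1e-8)
        self.rho = 0.95 * self.rho + 0.05 * target

    def sample(self, batch_size):
        n_on = min(int(round(self.rho * batch_size)), len(self.online))
        on = [self.online[i] for i in rng.integers(0, len(self.online), n_on)] if n_on else []
        off = [self.offline[i] for i in rng.integers(0, len(self.offline), batch_size - n_on)]
        return on + off

buf = AdaptiveMixBuffer(offline_data=[("s", "a", 0.0, "s'")] * 1000)
buf.add(("s", "a", 1.0, "s'"))
buf.adapt(online_td=0.8, offline_td=0.2)
print(len(buf.sample(32)), buf.rho)
```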
First empirical study of machine unlearning in hybrid quantum-classical neural networks and variational quantum circuits.
Reinforcement learning framework to learn weather/climate model parametrization schemes as state-dependent functions online instead of using fixed coefficients.
Low-Rank Key-Value (LRKV) attention reduces transformer KV cache memory by exploiting redundancy across attention heads with low-rank residuals.
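A sketch of the cache-size argument for keys (values are analogous), assuming all heads share one cached tensor plus per-token rank-r codes that a small per-head matrix lifts into head-specific residuals; shapes and layout are illustrative, not the paper's exact design.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, H, r = 16, 64, 8, 4              # tokens, head dim, heads, residual rank

K_shared = rng.normal(size=(T, d))     # cached once, reused by every head
K_codes  = rng.normal(size=(T, r))     # cached low-rank codes, r << d
U = 0.05 * rng.normal(size=(H, r, d))  # per-head lift from codes to head dim (weights, not cache)

def head_keys(h):
    # Reconstruct head-specific keys: shared part plus a rank-r per-head residual.
    return K_shared + K_codes @ U[h]

cache_per_head = H * T * d             # baseline: one full key copy per head
cache_lowrank  = T * d + T * r         # shared copy + codes shared across heads
print(head_keys(0).shape, cache_per_head, cache_lowrank)
```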
BadImplant introduces multi-targeted backdoor attacks against graph neural networks with injection-based mechanisms.
Explainable AI methods to improve ML reliability and prevent unexpected behavior in industrial cyber-physical systems.
SPICE uses submodular optimization and Fisher information to select training data for efficient LLM instruction tuning while addressing gradient conflicts.
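A sketch of the submodular-selection ingredient, assuming greedy facility-location over per-example gradient embeddings; SPICE's actual objective and its Fisher-information weighting are not reproduced here.

```python
import numpy as np

def greedy_facility_location(features, k):
    # Greedy maximization of f(S) = sum_i max_{j in S} sim(i, j), a classic submodular objective.
    sim = np.clip(features @ features.T, 0.0, None)
    selected, coverage = [], np.zeros(len(features))
    for _ in range(k):
        gains = np.maximum(sim, coverage).sum(axis=1) - coverage.sum()
        gains[selected] = -np.inf
        j = int(gains.argmax())
        selected.append(j)
        coverage = np.maximum(coverage, sim[j])
    return selected

rng = np.random.default_rng(0)
G = rng.normal(size=(100, 16))                     # stand-in for per-example gradient embeddings
G /= np.linalg.norm(G, axis=1, keepdims=True)
print(greedy_facility_location(G, k=10))
```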
Infusion framework uses influence functions to craft training data perturbations that induce targeted model behavior changes, evaluated on vision and language tasks.
Open-source foundation model for 3D molecular and materials modeling with both generative and predictive capabilities.
Interventional time series data generator for training causal foundation models on time series, extending prior-data fitted networks to temporal domains.
EvoFlows: variable-length protein sequence model using flow matching for protein engineering with native support for insertions, deletions, and mutations.
CONSERVAttack method for identifying vulnerabilities and systematic uncertainties in ML models applied to high-energy physics data analysis.
Non-parametric conformal regression method using optimal binning with CRPS loss for conditional distribution estimation.
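A sketch of CRPS evaluation for a binned predictive distribution, assuming fixed bin edges and a piecewise-constant CDF; how the binning itself is optimized is not modeled here.

```python
import numpy as np

def crps_binned(bin_edges, probs, y):
    # Piecewise-constant approximation: the predictive CDF and the observation's
    # step function are both evaluated at the right edge of each bin.
    widths = np.diff(bin_edges)
    cdf = np.cumsum(probs)
    obs_cdf = (bin_edges[1:] >= y).astype(float)
    return float(np.sum((cdf - obs_cdf) ** 2 * widths))

edges = np.linspace(0.0, 10.0, 21)      # 20 equal-width bins (the paper optimizes the binning)
probs = np.full(20, 1 / 20)             # uniform predictive histogram
print(crps_binned(edges, probs, y=3.2))
```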
MR-CDM: multi-resolution conditional diffusion framework for variable-length time series forecasting with hierarchical decomposition and adaptive embeddings.
Data-driven sports training framework using skeleton-based biomechanical analysis and motion modeling for personalized dart coaching.
Open-source benchmark and reproducible implementation of Matrix Profile methods for univariate and multivariate time-series anomaly detection.
On-policy self-distillation approach for LLM training combining dense teacher signals with sparse verifiable rewards from environment feedback.
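A toy sketch of combining the two signals, assuming a per-token forward KL to the teacher as the dense term and a REINFORCE-style term from a sparse 0/1 verifiable reward; the weighting and the choice of KL direction are illustrative.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def combined_objective(student_logits, teacher_logits, sampled_tokens, reward, lam=0.5):
    p_s, p_t = softmax(student_logits), softmax(teacher_logits)          # (T, V)
    dense = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1).mean()
    logp = np.log(p_s[np.arange(len(sampled_tokens)), sampled_tokens] + 1e-12)
    sparse = -reward * logp.mean()                                        # policy-gradient surrogate
    return lam * dense + (1 - lam) * sparse

rng = np.random.default_rng(0)
T, V = 6, 10
loss = combined_objective(rng.normal(size=(T, V)), rng.normal(size=(T, V)),
                          rng.integers(0, V, size=T), reward=1.0)
print(loss)
```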
NativeTernary: binary encoding format for ternary neural network weights achieving 2 bits per weight and 1.31x compression over GGUF for BitNet models.
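A sketch of what 2 bits per weight means in practice, assuming ternary weights {-1, 0, +1} are mapped to 2-bit codes and packed four per byte; the actual NativeTernary container layout (headers, block scales) is not modeled.

```python
import numpy as np

def pack_ternary(w):
    codes = (np.asarray(w, dtype=np.int8) + 1).astype(np.uint8)     # -1,0,+1 -> 0,1,2
    pad = (-len(codes)) % 4
    codes = np.concatenate([codes, np.zeros(pad, dtype=np.uint8)]).reshape(-1, 4)
    return (codes[:, 0] | (codes[:, 1] << 2) | (codes[:, 2] << 4) | (codes[:, 3] << 6)).astype(np.uint8)

def unpack_ternary(packed, n):
    b = np.asarray(packed, dtype=np.uint8)
    codes = np.stack([(b >> s) & 0b11 for s in (0, 2, 4, 6)], axis=1).reshape(-1)[:n]
    return codes.astype(np.int8) - 1

w = np.random.default_rng(0).integers(-1, 2, size=13)
assert np.array_equal(unpack_ternary(pack_ternary(w), len(w)), w)
```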
Proposes a k-maximum inner product attention mechanism for graph transformers to reduce computational complexity while maintaining expressive power.
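A sketch reading "k-maximum inner product attention" as keeping, for each query, only the k keys with the largest inner products and softmaxing over that subset; indexing details are illustrative.

```python
import numpy as np

def topk_attention(Q, K, V, k):
    scores = Q @ K.T                                   # (n_q, n_k) inner products
    idx = np.argpartition(scores, -k, axis=1)[:, -k:]  # top-k key indices per query
    top = np.take_along_axis(scores, idx, axis=1)
    top -= top.max(axis=1, keepdims=True)
    w = np.exp(top)
    w /= w.sum(axis=1, keepdims=True)
    return np.einsum('qk,qkd->qd', w, V[idx])          # weighted sum over the selected values

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(5, 8)), rng.normal(size=(32, 8)), rng.normal(size=(32, 8))
print(topk_attention(Q, K, V, k=4).shape)              # (5, 8)
```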
Deep learning approach for clinical risk prediction from incomplete multimodal EHR data using a point cloud paradigm to handle irregular sampling and missing modalities.
Empirical robustness analysis of TabPFN's attention mechanisms for tabular in-context learning, examining noise immunity across heterogeneous datasets.
Active inference methodology for ML-assisted data collection, using models to identify which points merit labeling under budget constraints for efficient learning.
Studies linearization of discrete transportation distance on graphs, connecting optimal transport to graph structure and providing nonasymptotic analysis.
Develops Thompson Sampling theory for discounted infinite-horizon MDPs with Borel state/action spaces and unknown parameters using a canonical probability space framework.
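A finite-state, finite-action illustration of the posterior-sampling loop (the paper works in general Borel spaces): sample an MDP from the posterior, act greedily in it, and update transition counts; rewards are assumed known here for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, gamma = 5, 2, 0.9
true_P = rng.dirichlet(np.ones(S), size=(S, A))    # unknown transition kernel
R = rng.uniform(size=(S, A))                       # rewards assumed known in this toy
counts = np.ones((S, A, S))                        # Dirichlet posterior over transitions

def q_from(P, iters=100):
    # Plan in a sampled MDP with value iteration.
    Q = np.zeros((S, A))
    for _ in range(iters):
        Q = R + gamma * P @ Q.max(axis=1)
    return Q

s = 0
for t in range(1000):
    P_sample = np.array([[rng.dirichlet(counts[i, a]) for a in range(A)] for i in range(S)])
    a = int(q_from(P_sample)[s].argmax())          # act greedily in the sampled MDP
    s_next = rng.choice(S, p=true_P[s, a])
    counts[s, a, s_next] += 1                      # posterior update from the observed transition
    s = s_next
print(counts.sum(axis=2))                          # visit counts per (state, action)
```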
Studies best-arm identification with differential privacy guarantees in local and central models for privacy-sensitive applications like clinical trials and hyperparameter tuning.
RL framework studying how children learn numbers using base-ten blocks, investigating numerical cognition through reinforcement learning and neural networks.
Provides convergence analysis and minimax optimality guarantees for kernel instrumental variable regression in both identified and non-identified settings.
Finite-time analysis of two-time-scale stochastic approximation algorithms with non-expansive mappings for optimization, reinforcement learning, and control applications.
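A toy instance of the classic coupled recursion x_{t+1} = x_t + a_t (f(x_t, y_t) + noise), y_{t+1} = y_t + b_t (g(x_t, y_t) + noise) with b_t << a_t, so y moves on the slow timescale; the maps f, g and step-size schedules are toy choices, not the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = 5.0, -3.0
for t in range(1, 20001):
    a_t = 1.0 / t**0.6          # fast step size
    b_t = 1.0 / t**0.9          # slow step size
    f = -(x - y)                # fast variable tracks the slow one
    g = -(y - 1.0)              # slow variable drifts toward 1
    x += a_t * (f + 0.1 * rng.normal())
    y += b_t * (g + 0.1 * rng.normal())
print(x, y)                     # both approach 1.0
```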
LongSpec accelerates LLM inference on long contexts via lossless speculative decoding with efficient drafting and verification, targeting agent-based applications.
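A toy sketch of the standard lossless speculative-sampling rule that such methods build on: a cheap draft proposes tokens, the target distribution scores them, each token is accepted with probability min(1, p_target/p_draft), and a rejection resamples from the residual distribution. The "models" below are just random categorical distributions; LongSpec's long-context drafting and verification machinery is not represented.

```python
import numpy as np

rng = np.random.default_rng(0)
V = 8                                           # toy vocabulary size

def draft_dist(_ctx):  return rng.dirichlet(np.ones(V))
def target_dist(_ctx): return rng.dirichlet(np.ones(V))

def speculative_step(ctx, gamma=4):
    out = list(ctx)
    for _ in range(gamma):
        q = draft_dist(out)                     # draft distribution at this position
        p = target_dist(out)                    # target distribution at the same position
        x = rng.choice(V, p=q)                  # draft proposal
        if rng.uniform() < min(1.0, p[x] / q[x]):
            out.append(x)                       # accepted: output still matches the target exactly
        else:
            resid = np.maximum(p - q, 0.0)
            out.append(rng.choice(V, p=resid / resid.sum()))
            break                               # stop the speculative run after a rejection
    return out

print(speculative_step([0], gamma=4))
```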
Spike-based alignment learning resolves the weight transport problem in neural networks, enabling local computation compatible with biological networks and neuromorphic hardware.
AHCQ-SAM addresses post-training quantization challenges for Segment Anything Model to enable efficient deployment on resource-constrained edge devices.
Analyzes computational bottlenecks in denoising diffusion models, examining efficiency of drift learning and sampling procedures for probability distribution approximation.
Study of 14 LLMs showing mathematical reasoning accuracy drops by 0.3-5.9% when math problems are culturally contextualized, revealing model limitations beyond pure logic.
Active Perception Learner (Apple) applies reinforcement learning to enable general active perception in robotic systems with sparse, local sensory information.
LongWriter-Zero uses reinforcement learning to improve ultra-long text generation in LLMs, overcoming length limitations and quality degradation without relying on synthetic training data.
Neural stochastic optimization method for solving two-stage unit commitment problems using deep networks to approximate recourse costs under high-dimensional uncertainty.
MF-GLaM develops a multifidelity stochastic emulator using generalized lambda models for simulating conditional probability distributions in scientific computing.
AugLift improves 3D pose estimation from 2D keypoints using depth-aware input reparameterization and foundation models for better domain generalization.
ShadowNPU optimizes on-device LLM inference by addressing quantization sensitivity in attention operators, enabling efficient NPU execution for privacy-preserving deployment.
Generative AI pipeline for synthetic building data creation addressing scarcity in residential energy modeling datasets.
Framework enabling LLMs to maintain alignment across sequential preference updates without catastrophic forgetting, using memory-augmented optimization.
PAC-Bayesian generalization bounds using constrained f-entropic risk measures for handling subgroup imbalances and distributional shifts.
Approach for learning symbolic world models from single-episode exploration in stochastic environments without human guidance.
Method for cost-efficient reinforcement learning on LLMs using preemptible cloud resources, optimizing rollout and training stages separately.
Framework for transferring knowledge from rich sensor modalities to deployable sensors in embodied AI systems using multi-sensory learning.