Efficient, Property-Aligned Fan-Out Retrieval via RL-Compiled Diffusion
RL-based method for retrieving diverse, property-aligned result sets using diffusion models for set-valued retrieval objectives.
Dataset and neural network approach for radiomap prediction in 6G wireless systems with extra-large MIMO arrays.
Reference architecture framework analyzing 18 RL implementations to establish common patterns and standardization for RL frameworks.
Continual learning strategy for online adaptation of interactive segmentation models in medical imaging with low-parameter updates.
Method for computing certified bounds on function space norms of deep neural networks applied to PDE solutions.
Theoretical analysis of quantum diffusion models and score reversal in Gaussian dynamics with complete positivity constraints.
Optimization technique using semantic-aware caching to improve instance retrieval efficiency in concept learning on knowledge bases.
ML system trained on 45k+ ultrasound images to detect fetal orofacial clefts prenatally, addressing shortage of specialist expertise.
Two-stage hybrid framework combining logical options with deep reinforcement learning to improve agent alignment and prevent over-exploitation of early reward signals.
SCOPE, an incremental few-shot 3D point cloud segmentation method, addresses catastrophic forgetting by leveraging unlabeled background scenes under sparse supervision.
BEVLM distills semantic knowledge from LLMs into bird's-eye view representations for autonomous driving with improved spatial consistency and reduced computation.
Cognitive explainer that uses medical concepts to interpret deep neural network decisions in fetal ultrasound standard scan plane detection.
Expert-aided causal discovery method incorporating background knowledge ex-post to improve reliability of ancestral graph discovery while minimizing expert probing costs.
Unified framework for learning with nonlinear model classes from arbitrary linear measurements in Hilbert spaces with novel data-dependent learning guarantees.
Energy-dissipation analysis for neuromorphic learning-in-memory optimizers using compute-in-memory paradigm to address memory-wall and energy bottlenecks.
Tutorial and survey on predictive coding networks, based on the neuroscientific framework that views the brain as a hierarchical Bayesian inference model minimizing prediction errors.
PACE combines parameter-efficient fine-tuning with consistency regularization to improve generalization by reducing gradient norms during transformer adaptation to downstream tasks.
L0-regularized kernel-free quadratic surface SVMs addressing overfitting and interpretability by reducing model parameters that scale quadratically with dimensionality.
FragFM, a hierarchical framework using fragment-level discrete flow matching and coarse-to-fine autoencoders for efficient, scalable molecular graph generation.
System-level DPO method for aligning compound AI systems, which comprise multiple interacting components such as LLMs, foundation models, and external tools, with human preferences.
Context-Aware Priority Sampling using VQ-VAEs to improve data efficiency and handle imbalanced datasets in imitation learning for autonomous driving systems.
Controlled study of how LLM tokenizer bias and backbone capability affect time series forecasting with pre-trained language model backbones.
Survey of federated learning, a privacy-preserving distributed machine learning approach in which multiple clients collaboratively train models without centralizing sensitive data.
FourierSpecNet, a hybrid framework combining Fourier spectral methods with neural networks to efficiently approximate collision operators when solving the Boltzmann equation.
Asynchronous-to-synchronous paradigm for event cameras using learned feature encoding to improve expressivity and generalizability for sparse sequential visual data.
DejaVu attack exploiting temporal misalignment vulnerabilities in multimodal fusion systems for autonomous driving by manipulating camera and LiDAR synchronization.
State Space Neural Operator for learning solution operators of time-dependent PDEs using structured state space models with adaptive damping and learnable frequency modulation.
arXiv paper providing theoretical analysis of GRPO (Group Relative Policy Optimization) for LLM fine-tuning from human feedback.
arXiv paper analyzing EM algorithm behavior under model misspecification in mixture models with excess components.
arXiv paper analyzing GNN-based SAT solvers through the geometric lens of graph Ricci curvature to explain performance degradation.
arXiv paper on efficient world models for heterogeneous multi-task planning, addressing gradient conflicts and plasticity loss.
arXiv paper on assessing performance of language model applications in healthcare, addressing evaluation methodology.
arXiv paper introducing Answer-Then-Check safety alignment method to defend LLMs against jailbreak attacks using reasoning.
arXiv paper on prompt-based federated continual learning addressing class-wise and temporal forgetting across distributed clients.
arXiv paper applying auto-regressive U-Net for predicting time-dependent damage in concrete. Domain-specific, limited AI/ML generality.
arXiv paper on training diffusion language models with planner-aware path learning to optimize generation strategies.
arXiv paper formulating diffusion model alignment as variational EM to reduce reward over-optimization and mode collapse.
arXiv paper on opinion dynamics optimization using low-rank matrix bandits in online settings. Social dynamics focus, limited AI/ML relevance.
arXiv paper on adapting decoder-only LLMs to partial differential equations via cross-modal learning for scientific machine learning.
arXiv paper introducing KLASS sampling method for masked diffusion models using token-level KL divergence to accelerate inference.
SQDF applies soft Q-function RL to fine-tune diffusion models with KL regularization, mitigating reward over-optimization.
Analyzes diversity loss in RL-trained LLMs caused by mode-seeking reverse KL; proposes forward KL filtering for reasoning tasks.
A-3PO accelerates asynchronous LLM RL training with staleness-aware proximal policy approximation, improving over decoupled PPO.
Data-driven sensitivity analysis using Individual Conditional Expectations for interpreting black-box models in engineering design.
Analysis of optimization challenges in hyperbolic deep RL identifying gradient factors affecting training success for hierarchical state embeddings.
CARE, a failure-centric post-training framework, uses contrastive learning on wrong rollouts to improve multimodal reasoning with verifiable rewards.
LLMTM benchmarks LLMs on temporal motif analysis in dynamic graphs for anomaly detection and structural understanding.
Spectral embedding approach for domain-invariant representations via optimal transport plans, addressing distributional shift.
CELM, a foundation model, generates clinical notes from long-duration EEG recordings using multimodal learning.
EDIS analyzes token-level entropy trajectories during LLM generation to diagnose reasoning quality beyond aggregate confidence statistics.