MF-QAT: Multi-Format Quantization-Aware Training for Elastic Inference
arXiv: Multi-format quantization-aware training enables single-model robustness across multiple numeric precisions for elastic inference.
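The MF-QAT entry above names the digest's title topic. As a generic illustration (not the paper's actual method), multi-format QAT is commonly built on "fake quantization": the forward pass rounds weights onto a bit-width-dependent grid, and one shared set of weights is trained against losses from several target precisions. A minimal sketch, assuming symmetric uniform quantization and hypothetical target formats of 8, 4, and 2 bits:

```python
import numpy as np

def fake_quantize(w, num_bits):
    """Simulate symmetric uniform quantization at a given bit-width.

    Only the forward pass is shown; in real QAT the backward pass
    typically uses a straight-through estimator so gradients flow
    through the rounding step.
    """
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8-bit
    scale = np.max(np.abs(w)) / qmax
    if scale == 0:
        scale = 1.0                          # avoid division by zero
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale                         # dequantized ("fake") weights

# Multi-format idea (sketch): evaluate the same weights under several
# precisions; a multi-format loss would sum the per-format errors.
w = np.array([0.8, -0.31, 0.05, 1.2])
for bits in (8, 4, 2):
    w_q = fake_quantize(w, bits)
    print(bits, np.max(np.abs(w - w_q)))     # rounding error per format
```

Lower bit-widths produce a coarser grid and larger rounding error, which is why training against all target formats at once is needed for elastic inference.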
arXiv: Multi-task representation learning in linear bandits with shared latent representations for knowledge transfer.
arXiv: Test-time adaptation for LLMs under continual distribution shift and open-set tasks, preserving source knowledge.
arXiv: HabitatAgent multi-agent LLM system for housing consultation with transparent reasoning and factuality guarantees.
arXiv: Study on how representation choice affects interpretation of protein conformational dynamics from molecular dynamics simulations.
arXiv: Sparse Identification Graph Neural Network for discovering interpretable governing equations in ultra-large complex systems.
arXiv survey: On-policy distillation transfers reasoning from frontier LLMs to smaller models, addressing exposure bias in knowledge distillation.
arXiv: Prompt-based online continual learning for next-activity prediction in dynamic processes, with mitigation of catastrophic forgetting.
arXiv: Variational Neural Stochastic Differential Equations model complex socioeconomic time-series data with heterogeneous dynamics.
arXiv: Full-gradient successor feature representations improve convergence guarantees for transfer learning in RL with non-linear function approximation.
arXiv: Empirical comparison of neural operator surrogates including Fourier neural operators vs polynomial methods for parametric PDEs.
arXiv: Group Relative Policy Optimization variant addresses advantage collapse in reinforcement learning with verifiable rewards using hints.
Analysis of silent data corruption during LLM training on hardware, studying gradient corruption impacts and detection mechanisms.
Spectral Compact Training method reduces LLM training memory footprint by replacing dense weight matrices with truncated SVD factors.
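As a generic illustration of the factorization idea in the entry above (not the paper's actual training procedure), replacing a dense weight matrix with rank-r truncated SVD factors cuts the parameter count from d_out * d_in to r * (d_out + d_in). The layer sizes and rank below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 256, 512, 32           # hypothetical layer sizes and rank

W = rng.standard_normal((d_out, d_in))  # dense weight matrix
U, s, Vt = np.linalg.svd(W, full_matrices=False)

# Keep the top-r singular triplets: W is approximated by A @ B.
A = U[:, :r] * s[:r]                    # shape (d_out, r)
B = Vt[:r, :]                           # shape (r, d_in)

dense_params = W.size                   # 256 * 512 = 131072
factored_params = A.size + B.size       # 32 * (256 + 512) = 24576
print(dense_params, factored_params)

# Forward pass through the factors: two skinny matmuls instead of one dense.
x = rng.standard_normal(d_in)
y_low = A @ (B @ x)
```

Training the factors A and B directly (rather than W) is what yields the memory savings during training, since optimizer state scales with parameter count.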
Transformer-based model with biomarkers for immunotherapy response prediction, improving generalization across diverse cancer datasets.
Open-ended narrative framework for wearable human activity recognition using compositional, unscripted activities instead of closed-set classification.
ThoughtSteer backdoor attack exploiting continuous reasoning in language models; the attack operates silently in hidden states without emitting tokens.
Method to reduce neural network multi-class classification complexity from O(n) to O(1) by leveraging known latent space geometry properties.
Optimus training library for pretraining mixture-of-experts LLMs at exascale on the Aurora supercomputer, demonstrating scaling across thousands of GPU tiles.
Deep learning method for plant phenology prediction using domain adaptation to improve climate change forecasting in ecological systems.
Experimental evaluation of Free-Market Algorithm orchestrated Mixture-of-Experts with cost-penalized fitness for domain adaptation.
Optimal decomposition technique for low-rank approximation of LLM weights enabling efficient fine-tuning and inference.
Method for language agents to optimize test-time adaptation policies through iterative refinement during inference.
Reinforcement learning approach with verification for iteratively improving LLM policies based on actual performance gains.
Framework for human-AI cooperation that models fatigue-induced performance degradation in learning-to-defer systems.
Compositional embedding method for protein networks using additive sequence models on biological interaction data.
Orthogonal learning approach for estimating heterogeneous long-term treatment effects combining experiments and observational data.
Method for verifiable repair of transformer vulnerabilities to adversarial perturbations with inner-layer guarantees.
Flow-based reinforcement learning policy with distributional approach for capturing multimodal solutions in trajectory optimization.
Graph partitioning technique using embeddings to enable scalable distributed training of graph neural networks.
Transfer learning methodologies for Bayesian network structure learning with scarce data.
Model-based learning approach for finite-window policies in partially observable Markov decision processes.
Method for efficiently evaluating LLM downstream performance during training without expensive full inference.
Algorithmic approach to multi-objective optimization via hashing and randomization for identifying Pareto frontiers.
Theoretical analysis of dependency networks using information geometry perspective for modeling complex systems.
Data-driven sports training framework using skeleton-based biomechanical analysis and motion modeling for dart throwing.
AI pipeline extracting building elevation data from street-view imagery with ML imputation for flood risk assessment.
Analysis showing how irrelevant context degrades LLM reasoning performance despite test-time scaling capabilities.
Generative model approach using adversarial distribution alignment to bridge simulation-to-experiment gap in scientific domains.
ORCA framework calibrating LLM sampling through conformal prediction to improve test-time reasoning efficiency and generalization.
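The ORCA entry above pairs LLM sampling with conformal prediction. As a generic sketch of the underlying calibration step (split conformal prediction; none of the names below come from the paper), one computes a finite-sample-corrected quantile of nonconformity scores on a held-out calibration set, then keeps only test-time candidates below that cutoff:

```python
import numpy as np

def conformal_threshold(cal_scores, alpha=0.1):
    """Split conformal prediction: choose a score cutoff so that, on
    exchangeable data, the true answer is kept with prob. >= 1 - alpha."""
    n = len(cal_scores)
    q = np.ceil((n + 1) * (1 - alpha)) / n   # finite-sample correction
    return np.quantile(cal_scores, min(q, 1.0), method="higher")

# Toy calibration set: nonconformity = 1 - model prob. of the true answer.
rng = np.random.default_rng(1)
cal_scores = rng.uniform(0, 1, size=500)
tau = conformal_threshold(cal_scores, alpha=0.1)

# At test time, keep every candidate whose nonconformity is <= tau.
candidates = {"A": 0.05, "B": 0.4, "C": 0.97}   # hypothetical scores
pred_set = {c for c, s in candidates.items() if s <= tau}
print(tau, pred_set)
```

The prediction set shrinks as the model becomes better calibrated, which is the lever such methods use to make test-time sampling more efficient.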
Physics-informed neural network combining diffusion-advection with evidential fusion for air quality forecasting.
Multiscreen mechanism for language models enabling absolute query-key relevance assessment beyond relative attention redistribution.
CliffSearch agent framework for scientific algorithm discovery combining LLM-guided search with structured evolution of theory and code.
Mathematical framework analyzing what determines forecast skill in AI weather prediction, emphasizing training methodology over architecture.
LAPIS-SHRED method for reconstructing spatio-temporal dynamics from sparse observations using shallow recurrent decoders.
PhoneticXEUS model for robust multilingual phone recognition trained on large-scale data with pretrained representations.
LLM-based recruitment tool identifying requisition-specific competencies through dynamic few-shot prompting and reflection.
Text-based harmonization approach using LLMs to unify multi-institutional EHR data without explicit schema standardization.
Modular RL framework with decomposable reward modeling and realistic environment design for Forex trading applications.
Mathematical analysis establishing an isomorphism between ant colony behavior and ensemble learning methods such as boosting.