A Survey on Federated Fine-tuning of Large Language Models
Comprehensive survey of Federated Learning combined with LLM fine-tuning (FedLLM), covering privacy-preserving collaborative model adaptation methods.
Survey of trustworthy GUI agents built on LLMs, identifying execution-gap challenges in automating real-world digital environments where actions are irreversible.
CONTINA provides confidence intervals for traffic demand prediction with coverage guarantees for traffic operations planning.
Analysis of the stability properties of selective State-Space Models under discontinuous gating, using passivity and Input-to-State Stability frameworks.
RefLoRA improves LoRA fine-tuning of large models by identifying optimal low-rank factorizations to address convergence and performance degradation issues.
Analysis of performance asymmetry in Model-Based RL agents on Atari100k, showing dramatic variance across task types despite high average performance.
Framework for robust multivariate time series forecasting addressing channel dependencies, asynchronous sampling, and missing data simultaneously.
Wasserstein Barycenter Soft Actor-Critic algorithm improves sample efficiency in off-policy reinforcement learning via directed exploration.
CausalFM framework trains Prior-Data Fitted Networks as foundation models for causal inference via in-context learning on tabular data.
Frequency-domain occlusion method for interpreting time series neural networks, benchmarking frequency-based attribution approaches.
Efficient fine-tuning method for LLMs using entropy-based complexity detection to apply chain-of-thought reasoning selectively on difficult examples.
Theoretical analysis of transfer learning in infinitely wide neural networks under gradient flow, quantifying pretraining benefits.
Behavior-based User Segmentation proposes a tree-based data structure for hierarchical user representation in recommendation systems.
One-Step Flow Q-Learning accelerates Diffusion Q-Learning for offline reinforcement learning by enabling single-step denoising without auxiliary modules.
Uncertainty Propagation Networks extend neural ODEs to model both state trajectories and uncertainty quantification in continuous-time systems.
Method for heart rate prediction from heterogeneous health device data using unified representations for personalized monitoring.
MCTD-ME combines masked diffusion models with Monte Carlo Tree Search for protein design, addressing long-range dependencies and search space challenges.
Probabilistic Scenarios paradigm for time series forecasting generates finite scenario sets instead of samples to address computational and coverage limitations.
RHYTHM framework uses LLMs as spatio-temporal predictors with hierarchical temporal tokenization for human mobility prediction.
Polychromic objectives framework for reinforcement learning fine-tuning preserves policy diversity during RLFT to prevent mode collapse.
Recursive Self-Aggregation (RSA) test-time scaling method combines parallel and sequential inference to improve LLM reasoning capabilities.
Cautious Weight Decay (CWD) optimizer modification applies weight decay only to parameters aligned with optimizer updates.
TeamFormer proposes a shallow, parallel Transformer architecture with progressive approximation for efficient training and inference.
Latent-Augmented Discrete Diffusion (LADD) improves discrete diffusion models for fast language generation by modeling cross-token dependencies.
Comparative study of machine learning models (LASSO, random forest, XGBoost, neural network) for liver disease prediction.
Framework for scalable AI oversight by partitioning complex multi-domain evaluation tasks among domain-specific human experts.
ContextPilot accelerates long-context LLM inference by enabling context reuse via KV-cache optimization for RAG and agent memory layers.
Interpretable machine learning applied to urinary metabolomics data for ADHD biomarker discovery.
Physics-encoded inverse modeling framework for Arctic snow depth prediction combining sequential architecture with domain knowledge.
Framework for evaluating anomaly detection in 5G networks accounting for non-IID data and adaptive attackers.
LORE framework learns intrinsic dimensionality and ordinal embeddings from triplet comparisons for subjective perceptual spaces.
Research on scaling laws for LLM pretraining with different optimizers beyond AdamW, examining new optimizers like Muon, Shampoo, and SOAP.
Research on optimization algorithms for LLM reinforcement learning, comparing SGD vs Adam optimizers and their effectiveness in RL training phases.
AceGRPO method for autonomous ML engineering agents using adaptive curriculum and group relative policy optimization to overcome behavioral stagnation.
VESPO algorithm for stable off-policy LLM training via importance sampling with variance reduction to prevent policy divergence and collapse.
Vector quantization compression for MoE LLMs using KLT-guided SVD and bias-corrected quantization for ultra-low-bit model deployment.
Multi-tenant ML serving system handling seamless model updates while maintaining decision thresholds across clients with distribution shifts.
Pawsterior framework for simulation-based inference using variational flow matching with structured domain constraints for bounded parameters.
Analysis of worker-level optimization misalignment in data-parallel LLM fine-tuning despite parameter synchronization, termed silent inconsistency.
Numerical analysis showing GLU variants scale asymptotically faster than MLPs, explaining architectural dominance in frontier LLMs.
Study of multilingual data curation across 13 languages, identifying interference patterns and optimal training strategies for a 20-trillion-token dataset.
GLM-5 foundation model transitioning from vibe coding to agentic engineering with DSA cost reduction and async RL infrastructure for improved autonomy.
Study of representation collapse during neural network training across five model scales showing scale-invariant emergence patterns in 119 task combinations.
AI-CARE metric for evaluating ML models on carbon emissions and energy consumption alongside standard performance metrics.
Research on interpretable Graph Neural Networks using symbolic methods to overcome message-passing limitations and Weisfeiler-Lehman expressivity barriers.
Neuro-symbolic framework (NSGGM) for molecule and graph generation combining neural proposals with symbolic guarantees for controllable generation.
MPZCH indexing mechanism for large-scale recommendation systems to mitigate embedding collisions and improve model freshness in embedding tables.
MASPO algorithm for LLM reasoning via reinforcement learning, addressing gradient utilization, probability mass, and signal reliability in trust region mechanisms.
Theoretical framework for training modular LLMs by combining domain-specific experts without heuristic dataset weighting, matching monolithic model performance.
Framework for multi-round human-AI collaboration ensuring AI complements rather than undermines human decision-making via counterfactual harm and complementarity principles.