MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models
MicroMix is a mixed-precision quantization method using microscaling formats for efficient LLM inference on NVIDIA Blackwell hardware.
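The microscaling (MX) formats referenced here share one power-of-two scale across small blocks of elements (typically 32). The summary gives no implementation details for MicroMix itself, so the sketch below only illustrates generic MXINT8-style block quantization with a shared power-of-two scale; the function names and 1-D block layout are invented for illustration.

```python
import numpy as np

def mx_quantize(x, block=32, bits=8):
    """Illustrative block quantization: one shared power-of-two scale per
    block of `block` elements, int values in [-2^(bits-1), 2^(bits-1)-1]."""
    x = x.reshape(-1, block)
    qmax = 2 ** (bits - 1) - 1                       # 127 for 8-bit
    amax = np.abs(x).max(axis=1, keepdims=True)      # per-block max magnitude
    # shared scale restricted to a power of two (in the spirit of E8M0 scales)
    scale = 2.0 ** np.ceil(np.log2(np.maximum(amax, 1e-30) / qmax))
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def mx_dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
x = rng.standard_normal(128).astype(np.float32)
q, s = mx_quantize(x)
err = np.abs(mx_dequantize(q, s).ravel() - x).max()  # bounded by scale / 2
```

Because the scale is rounded up to a power of two, the worst-case per-element error is at most half the block scale; a mixed-precision scheme would additionally choose different bit widths per layer or channel.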
Novel fine-tuning mechanism for LLMs that addresses data quality/volume issues through controlled forgetting to improve domain adaptation.
PENGUIN: Transformer variant with periodic-nested group attention mechanism for improved long-term time series forecasting.
Empirical study of initialization schemes for Kolmogorov-Arnold Networks, proposing theory-driven approaches to improve training of spline-based KANs.
Training-free framework for deferring predictions to multiple experts using conformal prediction without retraining.
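Split conformal prediction is the standard training-free recipe behind such deferral schemes: calibrate a score threshold on held-out data, emit prediction sets with 1 − α coverage, and defer inputs whose set is not a singleton. The sketch below shows only that textbook recipe, not the paper's multi-expert framework; the function names and toy calibration data are illustrative.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split conformal calibration with nonconformity score 1 - p(true label)."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    k = int(np.ceil((n + 1) * (1 - alpha)))          # finite-sample quantile index
    return np.sort(scores)[min(k, n) - 1]

def prediction_set(probs, qhat):
    """All labels whose predicted probability clears the calibrated threshold."""
    return np.flatnonzero(probs >= 1.0 - qhat)

# toy calibration set: a model that always puts 0.9 on the true class
cal_probs = np.tile([0.9, 0.07, 0.03], (100, 1))
cal_labels = np.zeros(100, dtype=int)
qhat = conformal_threshold(cal_probs, cal_labels, alpha=0.1)

confident = prediction_set(np.array([0.95, 0.03, 0.02]), qhat)  # singleton set
unsure    = prediction_set(np.array([0.50, 0.45, 0.05]), qhat)  # not a singleton
defer = unsure.size != 1    # route this input to an expert instead of predicting
```

No model is retrained at any point: only held-out scores and a quantile are needed, which is what makes the approach training-free.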
ReTrack enables data unlearning in diffusion models via importance sampling to remove memorized training data influence.
GaussianPSL framework for multi-objective optimization with soft partitioning handling complex discontinuous and degenerate Pareto frontiers.
Neural network approach to learning modular genetic circuit functions in synthetic biology from input/output data.
Algorithms for distributed RL with policy gradients under asynchronous parallel computation and communication.
Benchmark for LLM-assisted emergency triage built from the MIMIC-IV-ED database, with preprocessing for rapid patient deterioration prediction.
Method for diagnosing when data augmentation and equivariant architectures improve or harm generalization under distribution asymmetry.
Q-learning algorithm for non-stationary RL with distribution shifts under both episodic and infinite-horizon settings.
Uses LLMs to programmatically synthesize anomaly detectors for tabular data without directly processing the raw data, preserving privacy.
ACE framework evolves context for self-improving LLM agents, addressing brevity bias and context collapse in iterative refinement.
Mitigates premature exploitation in particle filtering for inference-time scaling of language models using process reward models.
TabPFN-Wide extends prior-data fitted networks for tabular data with extreme feature counts in biomedicine applications.
Constraints-of-Thought framework enables LLMs to perform constrained multi-step reasoning while satisfying symbolic constraints and user intent.
PANTHER applies generative pretraining to model user behavior sequences beyond language, using multi-dimensional action attributes.
Bandit algorithm for high-stakes sequential decision-making that learns when to abstain from actions with irreparable consequences.
RL algorithm for learning policies that maximize return while inducing dispersed state distributions across multiple reward sources.
SPORE is a classical clustering algorithm handling arbitrary geometry without rigid assumptions on cluster structure.
Transformer-based symbolic regression method for discovering interpretable mathematical expressions from observed data.
Two-stage entropy approach for noise-tolerant multimodal LLM training using reinforcement learning with verifiable rewards.
Object-centric world models for reinforcement learning using decomposed representations to improve sample efficiency in multi-object environments.
UniGame addresses the inconsistency between understanding and generation in unified multimodal models through an adversarial framework.
SAFLe framework enabling scalable non-linear federated learning in a single round, invariant to heterogeneous data distributions.
Domain adaptive retrieval using prototype-based semantic consistency alignment to transfer knowledge from labeled to unlabeled domains.
Hybrid physics and ML approach for crop yield projections combining gridded crop models with machine learning to improve agricultural forecasting.
Research on measuring noise in LLM evaluations, using statistical methods to decompose evaluation variance into prediction noise, data noise, and their combination.
Day-ahead electricity price forecasting combining linear models, neural networks and online learning for volatile market prediction.
Symbolic regression with partial parameter sharing for discovering expressions describing related phenomena with varying parameters.
Hellinger multimodal VAEs using probabilistic opinion pooling to aggregate unimodal inference distributions.
Sparse-RL addresses memory bottleneck in LLM reinforcement learning by reducing KV cache overhead during long-horizon rollouts.
Dual-prototype disentanglement framework for context-aware time series forecasting using dynamic temporal pattern learning.
Generalized framework for adaptive grid allocation in Kolmogorov-Arnold Networks accounting for target function complexity.
Theoretical framework explaining memorization in diffusion models through weighted sum of empirical score functions.
TextBFGS applies case-based reasoning to iterative code generation with LLMs, using past solutions to guide optimization.
Benchmarks Echo State Networks for univariate time series forecasting against traditional statistical methods on the M4 dataset.
Domain adaptive diffusion policy for control that generalizes to unseen transition dynamics through domain representation learning.
Analyzes GRPO limitations in exploration and difficulty adaptation for LLM reasoning, proposing improvements to advantage symmetry.
VJE framework for self-supervised learning using reconstruction-free latent variables with symmetric conditional ELBO optimization.
Applies tabular foundation models to knowledge tracing for real-time student learning prediction without extensive offline training.
Interpretable image classification using hierarchical concept embeddings recovered from vision-language models.
φ-DPO addresses fairness in continual learning for multimodal models when training data is imbalanced across tasks.
Reduces transformer KV cache by using low-dimensional keys for attention selection while maintaining full-dimensional values.
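A generic two-stage version of this idea: score every token with cheaply projected low-dimensional keys, then run exact attention only over the top-k survivors with full-dimensional keys and values. In the sketch below the down-projection `P` is a random matrix purely for illustration (a real system would learn or derive it), and the single-query NumPy layout is a simplification of batched multi-head attention.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, d_low, topk = 256, 64, 8, 32

q = rng.standard_normal(d)
K = rng.standard_normal((T, d))                     # full-dimensional keys
V = rng.standard_normal((T, d))                     # full-dimensional values
P = rng.standard_normal((d, d_low)) / np.sqrt(d)    # illustrative down-projection

# stage 1: cheap candidate selection with low-dimensional keys, O(T * d_low)
scores_low = (K @ P) @ (q @ P)
idx = np.argpartition(scores_low, -topk)[-topk:]

# stage 2: exact softmax attention over the selected tokens only, O(topk * d)
s = K[idx] @ q / np.sqrt(d)
w = np.exp(s - s.max())
w /= w.sum()
out = w @ V[idx]
```

The memory win comes from stage 1: if the low-dimensional keys are what gets scanned at selection time, the full-dimensional entries are only touched for the k selected positions.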
Proposes using proper scoring rules to evaluate probabilistic predictions from tabular foundation models instead of point-estimate metrics.
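Two classic proper scoring rules, the log score and the Brier score, reward calibrated sharpness rather than accuracy alone. The toy example below is textbook material rather than the paper's protocol: two classifiers with identical accuracy that the proper scores nonetheless separate.

```python
import numpy as np

def log_score(probs, labels):
    """Negative log-likelihood of the observed labels (lower is better)."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

def brier_score(probs, labels):
    """Mean squared distance to the one-hot outcome (lower is better)."""
    onehot = np.eye(probs.shape[1])[labels]
    return np.mean(np.sum((probs - onehot) ** 2, axis=1))

labels = np.array([0, 1])
sharp  = np.array([[0.9, 0.1], [0.1, 0.9]])   # confident and correct
hedged = np.array([[0.6, 0.4], [0.4, 0.6]])   # same predictions, less sharp

# both classifiers are 100% accurate, yet the proper scores distinguish them
```

A point-estimate metric like accuracy would call these two models identical, which is exactly the failure mode proper scoring rules avoid.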
Replaces dense attention projections with Walsh Hadamard Transforms to reduce transformer parameters by 25% while maintaining performance.
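The (unnormalized) Walsh-Hadamard transform can stand in for a dense n × n projection while storing no weights and costing O(n log n) instead of O(n²); the 25% figure above is the paper's claim, while the sketch below only shows the standard in-place butterfly implementation of the transform itself.

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform via butterflies, O(n log n).
    `len(x)` must be a power of two; no learned parameters are involved."""
    x = x.copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b          # sum butterfly
            x[i + h:i + 2 * h] = a - b  # difference butterfly
        h *= 2
    return x

x = np.arange(8, dtype=float)
y = fwht(x)
x_back = fwht(y) / len(x)   # H @ H = n * I, so the transform self-inverts
```

Because H² = nI, applying the transform twice and dividing by n recovers the input, which also makes the layer trivially invertible.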
Theoretical analysis of multi-armed bandits under memory and batch constraints, studying regret bounds.
Autonomous driving framework using Dirichlet process mixture models and causal adjustment to address catastrophic forgetting and spurious correlations in lifelong learning.
LLM agent framework with causal scratchpad for open-ended scientific discovery through iterative program evolution.