When RL Meets Adaptive Speculative Training: A Unified Training-Serving System
Unified training-serving system combining RL with adaptive speculative decoding for accelerated LLM inference.
Infusion: Framework using influence functions to craft training data perturbations that induce targeted model behavior changes.
Uncertainty quantification for machine learning interatomic potentials using evidential deep learning.
Geometric analysis of optimization dynamics in transformers trained on modular arithmetic revealing low-dimensional subspaces.
Study on early-warning signals of grokking via loss-landscape geometry on SCAN and Dyck-1 benchmarks.
CeRA: Parameter-efficient fine-tuning method extending LoRA with non-linear capacity expansion via gating and dropout.
Physics-informed neural operators for solving PDEs with improved generalization beyond training distributions.
SafeSci: Framework for evaluating safety of large language models in scientific domains with comprehensive benchmarks.
CRISP: Method for teaching LLMs to reason more concisely via self-distillation with 'be concise' conditioning.
Stock market prediction combining a Node Transformer with BERT-based sentiment analysis for financial forecasting.
WinDiNet uses pretrained video diffusion model as differentiable physics simulator for urban wind flow prediction, replacing expensive CFD simulations.
λ-GELU parameterized gating function enabling controlled ReLU conversion while maintaining smooth activation properties for deployment.
ERPO method for token-level credit assignment in LLM reasoning models, addressing entropy collapse in GRPO through information heterogeneity.
Recurrent network training without Jacobian propagation, using hidden-state temporal credit assignment. Studies gradient normalization and online adaptation.
Mathematical framework explaining phase transitions in neural network training via the spectral gap of parameter-update Gram matrices, with analysis of grokking and sudden capability gains.
Transfer learning for nonparametric Bayesian networks under scarce data. Proposes PC-stable-transfer and hill-climbing transfer learning methods.
Tutorial on Bayesian Optimization for automating scientific discovery using surrogate models and probability-driven frameworks.
annbatch: mini-batch loader for terabyte-scale biological data in AnnData format, addressing memory bottlenecks in ML training on large datasets.
arXiv paper developing asymptotic theory for quantile estimation via stochastic gradient descent with constant learning rate.
arXiv paper introducing MAPP mechanism for efficient data marketplace pricing using learned value distributions.
arXiv paper on gen2seg: using generative models (Stable Diffusion, MAE) for category-agnostic instance segmentation.
arXiv paper proposing LMask, a learning framework using dynamic masking for constrained routing problems optimization.
arXiv paper comparing deep learning neural networks against statistical methods for solving ODE inverse problems.
arXiv paper analyzing tokenized U.S. Treasuries transactions on blockchain infrastructure.
arXiv paper on constrained free energy minimization for quantum thermodynamic system design.
arXiv paper analyzing 150+ years of German parliamentary migration debates using LLMs, revealing shift from post-war solidarity to anti-solidarity.
arXiv paper on ROPA: synthetic robot pose generation for RGB-D bimanual data augmentation to improve imitation learning policies.
Algorithm for column subset selection using adaptive randomized pivoting with connections to volume sampling.
Forecasting data movement patterns in MoE LLM inference to reduce bottlenecks in multi-unit serving systems.
Fast regret bounds for contextual bandits without realizability assumptions using pessimistic policy updates.
Seer: RL system using online context learning for fast synchronous LLM training, reducing rollout latency and improving resource utilization.
Investigation of test overfitting in SWE-bench issue resolution, where models pass the provided tests but miss important cases.
Autoregressive video generation using reward feedback to improve performance without strong teacher models.
GoogleFontsBench: benchmark for font classification using parameter-efficient fine-tuning of DINOv2 vision model.
Analysis of stochastic gradient descent convergence under exchangeable mini-batch sampling and Fisher information.
Adaptive guidance method for retrieval-augmented masked diffusion models to handle noisy retrieved context.
Neural network approach for inference in discrete choice models using equivariant architectures.
Privacy-accuracy trade-offs in sparse linear regression under differential privacy mechanisms.
Stage-level analysis of prompt injection attacks across five LLM agents, tracking defenses through kill-chain stages.
Geometric optimization framework using affine normal descent for smooth unconstrained optimization.
Multimodal LLMs struggle with spatial consistency reasoning across multiple 3D scene views.
Analysis of reliability and risk in AI-assisted medication decision systems in healthcare workflows.
ProdCodeBench: benchmark for evaluating AI coding agents using real developer-agent sessions and production workloads.
Study of how language pretraining biases transfer to vision tasks, addressing cross-modality adaptation challenges.
Extended research on learning state machines from data streams with PAC-learning bounds and improved heuristics.
Compiler-based approach to skills in LLM agents: analyzes 118k skills, treating them as code to improve consistency and portability across agent platforms.
Docracy: Postgres-backed document store for AI agents to create, use, and store context artifacts across tasks instead of relying on the filesystem.
Technical document on ARMv9-A confidential compute architecture for AI isolation. Content is incomplete and includes a philosophical tangent.
mesh-llm: Block's open-source project creating decentralized AI compute networks by pooling multiple machines for LLM inference.
SSH tool for connecting to machines behind NAT/firewalls without port forwarding. Infrastructure utility unrelated to AI.