Benchmarking IoT Time-Series AD with Event-Level Augmentations
Benchmarking framework for IoT time-series anomaly detection using event-level augmentations simulating real-world perturbations.
Introduces Soft Sequence Policy Optimization, advancing LLM alignment methods beyond GRPO with sequence-level importance sampling.
Theoretical analysis connecting random network distillation, deep ensembles, and Bayesian inference for uncertainty quantification in deep learning.
Shows Pass@k optimization can degrade Pass@1 due to prompt interference, revealing trade-offs in LLM fine-tuning for code generation.
AngelSlim: comprehensive toolkit for large model compression combining quantization, pruning, distillation, and speculative decoding.
Muon+ optimizer improves LLM pre-training by adding a normalization step after gradient orthogonalization.
Analyzes hardness of maximum likelihood learning for Determinantal Point Processes used in subset selection.
Theoretical bounds on approximation error for ReLU networks approximating low-regularity bounded functions.
Studies differential privacy of quantum and quantum-inspired classical recommendation algorithms.
Unbiased Sliced Wasserstein kernel addresses exposure bias in audio captioning systems to improve caption quality.
SODAs: algorithm for data-driven discovery of differential and algebraic equations using a sparse optimization approach.
Adaptive physics-inspired design for machine-learning interatomic potentials using Fisher information matrix guidance.
Knowledge Fusion via SkillPacks enables efficient cross-capability transfer between LLMs for multi-task integration and model compression.
LinGuinE: PyTorch framework for longitudinal volumetric tumor segmentation combining image registration and guided segmentation.
Theoretical analysis of nonlinear attention mechanisms compared to linear regression for understanding interpolation error.
Studies exchangeability versus i.i.d. assumptions for handling distribution shifts when pooling medical imaging datasets.
LayerT2V framework for text-to-video generation producing editable layered representations for professional workflows.
PoET-2 multimodal retrieval-augmented foundation model for understanding protein function using protein language models.
Adaptive hybrid caching optimization for efficient inference in video diffusion transformer models to reduce computational cost.
Geometric autoencoders for Bayesian inference to recover physical system information from limited noisy observations.
Studies dyslexia-like behavior in vision-language models by reducing visual word form area activity to test reading impairment mechanisms.
ConflictScope pipeline automatically evaluates how LLMs prioritize different values when facing conflicting objectives.
Face anonymization using diffusion models to protect identity while preserving image quality and enabling authorized recovery.
Atlas-free Transformer approach for brain network analysis without standardized anatomical templates.
Deep learning framework for optimal stopping problems using martingale representation for high-dimensional financial hedging.
Random search algorithms for vine copula structure learning in multivariate dependence modeling.
Supervised Reinforcement Learning method for training small open-source LLMs on multi-step reasoning tasks, combining SFT and RLVR approaches.
Method for efficient LLM inference via speculative decoding with dynamic tree construction accounting for system variables.
Temporal Sparse Autoencoders using sequential language structure for discovering interpretable features in LLM representations.
Analysis of energy efficiency of small LLMs on local hardware accelerators versus cloud inference for practical deployment.
VLM-Pruner: Token pruning method for vision-language models addressing spatial sparsity and inter-token redundancy.
One-step diffusion samplers using self-distillation and deterministic flow for efficient sampling from unnormalized distributions.
Intelligent agent system for automatically reproducing deep learning bugs by leveraging nondeterminism and environment coupling.
LeanCat: Benchmark of 100 formalized category theory tasks in Lean evaluating LLMs on library-grounded abstraction and theorem proving.
Physics-based noise synthesis framework for astronomical imaging denoising using learning-based methods.
Framework connecting GFlowNets to Markov chain reversibility to control exploration-exploitation trade-off during training.
Method for disentangling style and content in human motion using residual quantized representations for motion style transfer.
Study of multi-query synthesis for dense retriever training showing quality-diversity trade-off benefits out-of-domain and multi-hop retrieval.
Evaluation showing GPT-4o lacks causal models of mental states required for true Theory of Mind despite benchmark performance.
LLM agent with internet access performs at-scale deanonymization of Hacker News users and interview participants.
LLM4Cov: Offline agent-learning framework using execution feedback from hardware simulators for test generation with high coverage.
Theoretical analysis of how low-precision quantization affects model and data capacities in high-dimensional linear regression.
Framework for improving graph neural network efficiency using continuous differential equations to avoid prohibitive tensor operations.
Method to decompose epistemic uncertainty in Bayesian deep learning by per-class contributions for asymmetric cost classification tasks.
Research on LLMs' capacity to be persuaded and to detect manipulation, testing vigilance and persuasion in high-stakes decision-making contexts.
Research paper on sparse weight editing for multilingual LLM safety alignment across low-resource languages without expensive retraining.
UX improvement for coverage dashboard: added pre-flight credit check before PR creation to prevent wasted requests.
Agentplace: tool for building and automating AI agents at scale. Addresses challenges in agent development and scaffolding.
Microsoft Copilot Tasks uses an AI agent to autonomously complete user tasks such as converting emails to slideshows.
Remote shell tool for Claude Code and similar applications. Content is primarily documentation references.