Diffusion Modulation via Environment Mechanism Modeling for Planning
Diffusion-based planning method for offline reinforcement learning with environment mechanism modeling to ensure trajectory consistency.
Heterogeneity-aware client selection for federated learning to improve model accuracy across statistically heterogeneous clients.
Prior-agnostic incentive-compatible exploration in bandit settings with multiple sequential agents.
Physics-guided HyperGraph Transformer for signal extraction from particle collision data at CERN's HL-LHC.
ActionEngine: training-free framework for GUI agents using state machine memory to reduce cost and latency and improve accuracy over reactive vision-language model approaches.
Inner speech guides steerable imitation learning for human-AI coordination, capturing behavioral diversity and non-Markovian human behaviors.
STAR-LDM integrates latent diffusion planning with autoregressive generation, enabling semantic planning before token commitment.
Theoretical proof that standard Transformers achieve minimax approximation rate for Hölder functions in nonparametric regression.
Personal information memorization in language models: detector suite for email, phone, IP addresses outperforms existing regex baselines.
Learning theory characterizing online and private learnability under distributional constraints via generalized smoothness.
Bayesian inference methods for analyzing high-resolution health and mobility data from actigraph wearable devices.
Interpretable open-world object detection framework using concept decomposition to distinguish known from unknown objects.
Theoretical analysis of SGD convergence with perturbations in forward-backward passes through sequential operators.
DANCE method for conformal prediction uncertainty quantification using adaptive neighborhood estimation with pre-trained deep learning models.
Vision-language models applied to ergonomic assessment by estimating hand distances from RGB video for NIOSH lifting equation.
F10.7 solar index forecasting using wavelet decomposition and iTransformer model with sunspot number features.
Communication-inspired discrete image tokenizer for vision transformers optimized for semantic structure over texture reconstruction.
SibylSense enables adaptive reward rubric learning for open-ended generation via memory tuning and adversarial probing to prevent reward hacking.
Analysis of speech spoofing detection systems examining the impact of speaker identity on detection embeddings and robustness.
Tail-aware divergence for language model distillation that decouples top-K probabilities to improve knowledge transfer from teacher to student models.
DRESS framework addresses computational complexity of higher-order Weisfeiler-Lehman graph analysis using continuous dynamical systems.
Functional Continuous Decomposition framework for non-stationary time-series analysis with parametric optimization and guaranteed continuity.
SpatiaLQA benchmark evaluates spatial logical reasoning capabilities in vision-language models for real-world scenarios.
Economic analysis of AGI impact on labor, marginal costs, and human verification as binding constraint on growth.
MRI brain lesion segmentation using report-supervised learning with multi-parametric scans and substructures.
PyTorch framework for medical image processing with support for volumetric data and domain-specific training procedures.
Empirical analysis of conditional independence tests for causal discovery, addressing failures in small samples and FDR control.
Studies multi-distribution learning complexity with bounded label noise to determine whether single-task 1/ε rates extend to multi-source settings.
Improves sequential attention for recommendation systems by integrating position information into attention mechanism beyond additive embeddings.
Proposes dual-model training framework inspired by neuroscience motivation states with alternating base and larger model activation.
Extends projection pursuit tree classifier with visual diagnostic methods for high-dimensional multi-class classification problems.
Analyzes complexity of classical acceleration (FISTA) for computing ℓ1-regularized PageRank with degree-weighted work bounds.
Applies vision-language models to radiology imaging for decision support with longitudinal multi-modal chest X-ray analysis.
DEEPSYNTH benchmark evaluates LLM-based agents on complex multi-source information synthesis tasks beyond fact retrieval.
Implements tensor parallelism for selective state-space models on multi-GPU systems to scale SSM inference beyond single GPU limits.
Decomposes epistemic uncertainty in Bayesian deep learning into per-class contributions for safety-critical classification with asymmetric costs.
Aletheia, a mathematics research AI agent powered by Gemini 3 Deep Think, autonomously solves 6 of 10 FirstProof challenge problems.
Combines off-policy and on-policy reinforcement learning for fast visual sim-to-real robotics training with minimal sample waste.
Proposes coalition-based partitioning approach for Shapley value attribution methods in explainable AI to resolve attribution conflicts.
Theoretical framework for disentangled representation learning when factors of variation are dependent rather than independent.
Proposes dynamic optimal transport minimization for discrete flow matching on categorical data with Kantorovich formulation.
Machine learning method for predicting subway passenger flows during incidents using causality-based two-stage approach.
Measurement-driven analysis of LLM inference energy and performance tradeoffs across workloads using GPU DVFS on 1B-32B parameter models.
Theoretical investigation of benign overfitting phenomenon in binary linear classification across over-parametrized neural networks.
Applies Deep Deterministic Policy Gradient (DDPG) reinforcement learning to engine control in a safety-critical testbench environment.
Theoretical analysis of neural network-based optimal transport solvers using semi-dual adversarial formulations for generative modeling.
Theoretical analysis of differentially private shuffled gradient methods for convex ERM, addressing privacy-accuracy tradeoffs compared to standard DP-SGD.
Analyzes flaws in Integrated Gradients attribution method and proposes alternative path-based approach using model-induced geometry for better feature importance explanations.
Proposes Cauchy-Schwarz divergence for vision-language alignment to address distributional differences and alignment-uniformity conflicts in multimodal models.
Semantic Parallelism optimizes MoE LLM inference by co-scheduling model device placement and request routing, reducing communication overhead in multi-device serving.