Graph-Dependent Regret Bounds in Multi-Armed Bandits with Interference
Multi-armed bandit algorithm using local graph structure to minimize regret under network interference.
Compositional automata learning technique for inferring models of concurrent systems through alphabet refinement.
SetONet neural operator for solving PDEs with variable sensor layouts by treating inputs as unordered sets.
Neural framework for learning conditional optimal transport maps using hypernetworks to generate adaptive transport parameters.
JUSSA framework uses steering vectors to improve LLM-as-a-judge reliability, detecting and mitigating sycophancy through honesty-promoting alternatives.
Binned semiparametric Bayesian networks for efficient kernel density estimation using data binning to reduce computational cost.
Double-Diffusion integrates ODE-prior with denoising diffusion models for spatio-temporal graph forecasting, balancing deterministic and stochastic components.
Klear-Reasoner model with long reasoning capabilities using gradient-preserving clipping policy optimization, with detailed training disclosures.
Knowledge component discovery in programming using representation learning on student code for personalized instruction systems.
Thompson sampling analysis for Sharpe ratio optimization in the multi-armed bandit setting, addressing a fractional objective whose mean and variance terms are dependent.
LSTM-based machine learning calibrator for agent-based epidemic models, learning inverse mapping from time series to SIR parameters.
EEG classification study comparing neural network architectures and optimizers across brain hemisphere frequency bands using TensorFlow/PyTorch.
Comprehensive survey of intrinsic dimension estimators under manifold hypothesis, reviewing theoretical foundations and comparing eight methods.
Analysis of weight constraints in linear smoothers for causal inference, trading off feature imbalance against parametric modeling assumptions.
Polychromic objectives approach to prevent mode collapse in reinforcement learning fine-tuning, preserving policy diversity during exploration.
Convergence analysis for decentralized SGD with high-probability guarantees, removing restrictive assumptions on gradient bounds and noise.
Mathematical analysis of incoherence in goal-conditioned autoregressive models, studying policy improvement through fine-tuning with online RL.
Empirical study examining the R-Learner framework limitations for network causal inference with graph-dependent heterogeneous treatment effects.
Theoretical analysis of diffusion models on discrete state spaces, establishing convergence guarantees for masked and random walk dynamics.
Tomographic Quantile Forests (TQF) for nonparametric uncertainty quantification in multivariate regression tasks.
Meta-probabilistic modeling framework for discovering latent structure across collections of related datasets using probabilistic graphical models.
Research on learnable Gray-Wyner networks for disentangling common and task-specific information in computer vision.
SAU method for privacy-motivated machine unlearning in sparse LLMs via gradient masking and importance redistribution.
Research showing activation steering vectors in LLMs are fundamentally non-identifiable with large equivalence classes.
FIRE method for reinitialization in continual learning that balances stability and plasticity in neural networks.
Research on Natural Hypergradient Descent for bilevel optimization using Fisher information matrix as Hessian surrogate.
Evaluation of scaling laws for Chemical Language Models on downstream molecular property prediction tasks.
Adaptive backbone scaling framework for class-incremental learning that balances plasticity and stability while reducing memory overhead.
Framework for learning interpretable nonlocal operator kernels from data for climate process modeling.
Standardized benchmark dataset for epitope-specific antibody design with unified evaluation metrics for generative methods.
Preconditioned optimization method using row-momentum normalization for scalable matrix-based neural network training.
Projection-free algorithm for contextual bandits achieving logarithmic regret with improved efficiency over Online Newton Step.
Skill routing system for LLM agents that identifies relevant skills from large ecosystems before planning or execution.
Framework using LLMs to automatically design reward programs for cooperative multi-agent RL systems with sparse task feedback.
DreamerAD uses latent world models for efficient RL in autonomous driving, compressing diffusion sampling 80x with visual interpretability.
ERL framework enabling LLM agents to self-improve through experiential learning from past interactions and reflective adaptation.
Neuro-symbolic method for process anomaly detection combining neural networks with domain knowledge from process mining.
Federated learning approach for livestock growth prediction addressing privacy concerns and limited datasets in agricultural applications.
Hierarchical indexing system for efficient fine-grained sparse attention in transformers, removing bottleneck from key selection.
Autoregressive system for generating complete analog circuit designs with topology and component values using graph VAE and flow-matching.
Neural operator learning nonlinear PDEs by lifting dynamics into linear latent space via Koopman generator decomposition.
Uses a 2-datapoint reduced density matrix, borrowed from quantum chemistry, to predict and understand phase transitions during neural network training.
Continual learning framework using hierarchical exploration-exploitation to acquire knowledge from task streams without catastrophic forgetting.
Combines MCMC correction with score-based diffusion models using Metropolis-Hastings steps for improved sampling in model composition.
Novel k-means clustering approach incorporating causal inference to identify heterogeneous treatment effects across unknown subgroups.
Method for estimating intrinsic dimensionality of datasets accounting for scale-dependent effects and measurement noise in unsupervised learning.
LLM-based approach for unsupervised code correctness evaluation that separates code comprehension from auditing to improve accuracy without reference implementations.
Method for learning stochastic differential equations from temporal snapshots without observable trajectories, applied to gene networks and financial markets.
Project management framework using GenAI agents to optimize team composition by matching personality roles.
Trust-region stochastic SQP algorithm for nonlinear optimization with complexity bounds under heavy-tailed noise.