WOMBET: World Model-based Experience Transfer for Robust and Sample-efficient Reinforcement Learning
Framework for generating and leveraging prior data through world models for sample-efficient offline-to-online reinforcement learning in robotics.
Characterizes necessary and sufficient conditions for reward poisoning attacks in linear MDPs, providing theoretical framework for attack feasibility.
Proposes dual formulation for robust reinforcement learning under dynamics uncertainty, addressing limitations of domain randomization and adversarial RL methods.
Theoretical analysis of KL divergence stability under Gaussian perturbations for non-Gaussian distributions, with applications to OOD detection in generative models.
C-Flat optimization for continual learning on task streams, avoiding forgetting with less computational overhead than prior approaches.
THEIA: modular neural architecture learning complete Kleene three-valued logic end-to-end across mathematical domains with compositional generalization.
Bayesian-ARGOS: principled method for discovering equations governing complex systems from noisy observations using sparse regression.
Systematic investigation of on-policy distillation dynamics in LLM post-training, identifying conditions for success and failure mechanisms.
Characterizes convex hulls of reachable sets for nonlinear systems with bounded disturbances and uncertain initial conditions.
Chatbot using NLP and deep learning to answer FAQs in Amharic for university students, addressing common administrative questions.
Sparse online learning algorithm for Koopman operator with stochastic approximation and convergence guarantees for nonlinear dynamical systems.
Fast training method for physics-informed neural networks solving PDEs without gradient descent, addressing optimization and temporal causality.
AudioX: unified multimodal framework for anything-to-audio generation integrating text, video, and audio signals for flexible audio synthesis.
Learning-augmented algorithms for densest subgraph problem using ML classifier predictions to achieve linear-time approximation.
PO-Flow: continuous normalizing flow framework for causal inference modeling potential outcomes and counterfactuals from observational data.
VS2 method for unsupervised adaptation of vision foundation models using sparse autoencoders for steering vectors without weight updates or labels.
Geminet: lightweight, scalable ML-based traffic engineering framework using duality-based iterative process that handles topology changes.
Survey of synthetic network traffic generation methods from statistical models to deep learning for data-driven networking applications.
Neural stochastic optimization method using deep networks to solve two-stage unit commitment problems under high-dimensional uncertainty.
Proposes unified evaluation framework for assessing forecasting capabilities of frozen vision models across diverse tasks and abstraction levels.
AutoMAT framework combines simulation, ML, and experiments for autonomous alloy discovery across competing objectives with data-efficient workflow.
RL-PLUS method addresses capability boundary collapse in LLMs using reinforcement learning with hybrid-policy optimization to improve reasoning abilities beyond base model limits.
Investigates Pac-Man adversarial attack on random walk algorithms in distributed systems, analyzing vulnerability of decentralized learning to malicious nodes.
Memp framework endowing LLM agents with learnable, updatable procedural memory. Distills agent trajectories into fine-grained instructions and script-like abstractions.
Deep learning approach for choroidal nevi lesion segmentation in fundus images. Addresses diagnostic challenges in ophthalmology with AI-based image analysis.
Mathematical framework applying Möbius inversion and Shapley values to characterize higher-order structure in weighted directed acyclic multigraphs.
Latent-space steering method to reduce code-switching in multilingual LLMs. Uses PCA on parallel translations to control language identity at inference time.
Diffusion language models with adaptive acceleration for code generation. Proposes Saber to balance inference speed and output quality with sampling optimization.
RL and vision-language models for long-horizon deformable object routing tasks in robotic assembly. Addresses planning and skill execution for cable/rope manipulation.
ZK-APEX system enables verifiable personalized machine unlearning on edge devices using zero-knowledge proofs for compliance verification.
TRIM framework routes only critical reasoning steps to capable models in multi-step reasoning tasks, reducing cascading failures in LLM applications.
Property-preserving kernel operator learning method for incompressible flow simulations respecting physical constraints.
Theoretical analysis of mini-batch gradient noise in SGD as sampling design problem with oracle complexity implications.
LoRA-MME ensemble architecture using parameter-efficient fine-tuning of transformer encoders for multi-label code comment classification.
Argument for quantum computers being naturally suited for spectral machine learning methods that manipulate Fourier spectra.
Machine learning framework for DC arc-fault detection in photovoltaic systems using lightweight, transferable, self-adaptive models.
Systematic evaluation of LLM formal reasoning capabilities using Chomsky hierarchy and computation theory benchmarks for automated software engineering.
Machine learning models for immunotherapy response prediction show limited generalization across patient cohorts in cancer treatment.
STEP-HRL hierarchical reinforcement learning framework reduces computational cost of LLM agents by learning from single-step transitions instead of long histories.
T-STAR framework applies tree-structured reinforcement learning to improve multi-turn LLM agent policy optimization by identifying critical reasoning steps.
Parameter-free extragradient algorithms for monotone variational inequalities with improved stepsize selection and non-ergodic convergence.
Pre-registered evidence showing AI safety measures can produce iatrogenic harm in medical LLM outputs depending on prompt phrasing.
Novel Dynamic Assembly Forest model detects diffusion-generated images using traditional machine learning instead of deep neural networks.
Doubly robust Q-learning approach for learning cost-optimal sequential decision policies in clinical settings with informative missingness.
LangFlow demonstrates continuous diffusion language models can match discrete counterparts by connecting embeddings and diffusion processes for language generation.
Spatial Atlas introduces compute-grounded reasoning for spatial-aware research agents, handling multimodal benchmarks through deterministic computation before LLM generation.
Empirical evaluation of differential privacy defenses against membership inference attacks in federated learning using NIST genomics challenge data.
Security vulnerability in Windows 11/10 where Windows Defender's cloud tagging causes file rewriting, enabling privilege escalation.
Essay arguing local LLM infrastructure doesn't require Ollama tooling.
Autonomous RL agent integrated with BDD framework for dynamic web UI testing, generating test scenarios aligned with business expectations.