Controllable protein design with particle-based Feynman-Kac steering
Feynman-Kac framework for guiding diffusion-based generative models toward proteins with specified properties and tailored structures.
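Feynman-Kac steering of a generative sampler typically amounts to a sequential Monte Carlo step: reweight a population of intermediate samples by a potential and resample. Below is a minimal generic sketch of such a reweight/resample step, not the paper's algorithm; `reward`, `temperature`, and `fk_resample_step` are illustrative names.

```python
import numpy as np

def fk_resample_step(particles, reward, temperature=1.0, rng=None):
    """One generic Feynman-Kac reweight/resample step (SMC sketch).

    particles: (n, d) array of intermediate sampler states.
    reward: callable mapping (n, d) -> (n,) scores (higher = better).
    Returns n particles resampled in proportion to exp(reward / temperature),
    biasing the population toward high-reward regions.
    """
    rng = rng or np.random.default_rng()
    log_w = reward(particles) / temperature  # log-potential per particle
    log_w -= log_w.max()                     # stabilize the exponential
    w = np.exp(log_w)
    w /= w.sum()                             # normalized importance weights
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]
```

In a diffusion setting this step would be interleaved with the model's denoising updates, so the reward only perturbs, rather than replaces, the learned dynamics.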
Unified stability analysis comparing SAM and SGD optimization algorithms showing role of data coherence and simplicity bias in generalization.
Comparative analysis of transformer models (DistilBERT) versus psycholinguistic features for detecting business email compromise attacks.
Framework addressing structural overfitting in graph neural networks for missing feature imputation using distribution-aware rectification.
Federated learning approach for vehicle edge caching using personalized distillation to predict user content preferences while preserving privacy.
Random-bridges framework for generative models using stochastic processes conditioned on target distributions for flexible transport between distributions.
Electric load forecasting model integrating multi-source textual data (news, social media, policies) with temporal grid-aware predictions.
Mamba-based neural operator framework for accurate chemical kinetics modeling in combustion simulations using efficient temporal modeling.
Distribution restoration method using noisy samples and optimal transport to recover the fully observed data distribution from partial, corrupted observations.
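The optimal-transport component in summaries like the one above is commonly computed with entropic regularization via Sinkhorn iterations. A self-contained generic solver, offered as background rather than the paper's method:

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.1, iters=200):
    """Entropic optimal transport via Sinkhorn iterations.

    a, b: source/target marginal weights (sum to 1).
    C: cost matrix, shape (len(a), len(b)).
    Returns a coupling P minimizing <P, C> plus an eps-scaled entropy term,
    with row marginals a (exact) and column marginals ~b.
    """
    K = np.exp(-C / eps)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(iters):        # alternating marginal projections
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]
```

With a small `eps` the coupling approaches the unregularized optimal transport plan, e.g. a permutation matrix when the cost has a unique matching.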
Method using large language models to measure semantic similarity in categorical data clustering, bridging the gap in attribute distance representation.
Differentiable adversarial framework for task-aware data reduction using learnable selector and minimax optimization to identify informative samples.
LLM-based hardware-aware quantization agent automating model quantization for efficient LLM deployment on resource-constrained hardware.
Theoretical analysis of implicit bias in stochastic learning using geometric perspective to explain solution selection in overparameterized models.
Split learning optimization method reducing memory overhead for LLM training on edge devices using hybrid-order optimization instead of first-order approaches.
Neural memory storage architecture for LLMs with invertible compression and learnable prediction for runtime memory.
Information-theoretic approach for designing shared visual tokenizers in unified multimodal LLMs.
Safety alignment framework addressing unique challenges of sparse routing in Mixture-of-Experts language models.
Training framework for multimodal systems to maintain performance when input channels are lost at deployment.
Pruning framework for physics-informed neural networks to improve robustness to noise in PDE inverse problems.
Time series imputation method using channel-head binding for handling diverse missing patterns.
Training method for LLMs to directly model generative reasoning process in scientific discovery applications.
Deep learning framework for diagnosing obstructive sleep apnea from oximetry data with clinical knowledge integration.
Multi-agent reinforcement learning algorithm for general-sum games with convergence guarantees in heterogeneous agent settings.
Optimization algorithm extending exponential moving average with adaptive rates and zero-noise optimality guarantees.
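The baseline being extended here is the classical exponential moving average; the adaptive rates and zero-noise guarantees are the paper's contribution and are not reproduced. The standard update, for reference:

```python
def ema_update(avg, value, beta=0.99):
    """One step of the classical exponential moving average:
    avg <- beta * avg + (1 - beta) * value.
    beta close to 1 gives heavier smoothing (longer memory)."""
    return beta * avg + (1.0 - beta) * value

# Smoothing a noisy sequence (e.g. training losses):
avg = 0.0
for v in [1.0, 0.0, 1.0, 0.0]:
    avg = ema_update(avg, v, beta=0.5)
# avg == 0.3125
```

An adaptive-rate variant would replace the fixed `beta` with a schedule; the exact schedule is specific to the paper.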
Formal grammar framework preventing data leakage in ML workflows through structural constraints and assessment gates.
Reinforcement learning method for post-training reasoning models using hindsight feedback in sparse reward environments.
Security considerations and recommendations for AI agents from Perplexity based on operating agentic systems in production environments.
Framework for measuring LLM robustness to prompt variations, typos, and alternative phrasings in real-world inputs.
Predictive maintenance framework for connected vehicles integrating sensor and environmental data with ML models.
Distributed learning algorithm combining Byzantine robustness with communication compression for collaborative ML systems.
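A standard building block for Byzantine robustness in this setting is a robust aggregator such as the coordinate-wise median, which tolerates a minority of arbitrarily corrupted workers. This is one common choice, not necessarily the aggregator the paper uses:

```python
import numpy as np

def coordinate_median(updates):
    """Coordinate-wise median over a list of worker gradient vectors.

    A standard Byzantine-robust aggregator: in each coordinate, the
    median ignores up to a minority of arbitrarily corrupted values,
    unlike the mean, which a single adversarial worker can skew freely.
    """
    return np.median(np.stack(updates), axis=0)
```

Communication compression (e.g. quantizing or sparsifying `updates` before aggregation) would be layered on top; combining the two soundly is the harder part the paper addresses.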
Research on sparse Mixture-of-Experts architectures proposing expert path perspective to understand token routing patterns across layers.
CALM: method for heterogeneous treatment effect estimation combining RCT and observational study data with covariate mismatch.
Theoretical analysis of pattern formation in diffusion models explained via out-of-equilibrium phase transitions.
MeanFlow-based learning approach for controlling large-scale swarms with limited sampled-data updates.
LLM-ODE: uses LLMs to discover governing equations of dynamical systems from data, improving on genetic programming approaches.
ALMAB-DC: sequential experimental design framework combining active learning, multi-armed bandits and distributed computing for black-box optimization.
Investigates interpretability of VAEs across modalities, showing image-domain causal circuits fail to generalize to tabular data.
Uncertainty quantification methods for distribution-to-distribution flow matching models in scientific imaging applications.
CellFluxRL: reinforcement learning post-training approach for virtual cell models with biologically-constrained generative models.
Causal discovery method for cascade-structured (chain-reaction) dynamical systems using interventional data.
Federated learning approach combining Byzantine robustness and differential privacy for distributed training.
Framework using LLMs to automatically design auxiliary reward programs for cooperative multi-agent reinforcement learning systems.
Compares LLM agents against classical hyperparameter optimization algorithms using autoresearch testbed for tuning small language models.
Theoretical analysis of online convex optimization with two-point bandit feedback achieving tight regret bounds.
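Two-point bandit feedback means the learner may query the loss at two nearby points per round and form a zeroth-order gradient estimate. A generic sketch of the standard spherical two-point estimator (background for the setting, not the paper's analysis):

```python
import numpy as np

def two_point_gradient(f, x, delta=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate.

    Queries f at x + delta*u and x - delta*u for a random unit
    direction u; the scaled finite difference d * (f+ - f-)/(2*delta) * u
    is an unbiased estimate of the gradient of a smoothed version of f
    (the dimension factor d comes from E[u u^T] = I/d on the sphere).
    """
    rng = rng or np.random.default_rng()
    u = rng.normal(size=x.shape)
    u /= np.linalg.norm(u)                       # uniform direction on sphere
    g = (f(x + delta * u) - f(x - delta * u)) / (2 * delta)
    return x.size * g * u
```

Averaging many such estimates recovers the true gradient for smooth `f`; regret bounds in this setting control how few queries per round still suffice.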
ATLAS-RTC: runtime control system for LLM agents enforcing structured output via token-level monitoring, biasing, masking and rollback.
HISA: hierarchical indexing system for efficient sparse attention in LLMs, reducing indexer bottleneck in token-level sparse mechanisms.
Develops interpretable ML framework for detecting low left ventricular ejection fraction from ECG data.
Applies Vision-Language Models to chip floorplanning macro placement optimization tasks.
Introduces HyperP, hypersphere parameterization for LLM scaling with improved stability and hyperparameter transfer.
Proposes time-varying momentum schedule derived from critically damped harmonic oscillator for neural network training optimization.