The Diffusion-Attention Connection
Theoretical connection between Transformers, diffusion maps, and magnetic Laplacians through Markov geometry.
Framework for evaluating fairness and equity across patient subgroups in brain tumor segmentation models.
Study of deliberative alignment for deeper safety in reasoning LLMs via attribution analysis.
Analysis of working memory limitations in LLMs and comparison with biological systems.
RWKV-based RL approach with explicit belief state representation for partial observability problems.
Computational model using a Transformer self-prior to simulate mirror self-recognition behavior without external rewards.
Theoretical analysis comparing entropy regularization and covariance-based mechanisms for controlling policy collapse in RL-enhanced LLMs.
Framework for robust structured prediction using Tsallis reweighting and task-agnostic prompting with XML structure for group-robust fine-tuning.
Guide-Core Policies framework for black-box LLM agents, in which guide models generate structured strategies that core models execute, reducing inference costs.
Review of explainable AI mechanisms for human activity recognition in healthcare, assistive living, and smart environments applications.
Unified framework for visual encoding and decoding from neural activity that models consistency between brain stimulus prediction and reconstruction.
Self-supervised learning on satellite imagery to predict mycorrhizal fungal biodiversity at landscape scales for ecosystem monitoring.
Analysis of preference encoding in looped transformer internal states using lightweight evaluator heads on an RLHF dataset.
Methods for personalizing generative user interfaces that address subjective preferences through preference divergence analysis and sparse feedback.
Self-supervised semantic enrichment method for medical vision-language datasets addressing reporting bias in radiology reports.
Multimodal EHR model for pediatric emergency triage using modality dropout to improve generalizability across demographics.
Convergence analysis of randomized Kaczmarz and SGD with a greedy step size, proving an O(1/t^(3/4)) last-iterate convergence rate.
Temporal-aware network improving simultaneous speech translation policy to balance quality and latency with temporal context awareness.
Sampling methods for diffusion language models balancing speed, quality, and diversity through tempered confidence-based remasking.
Hybrid condition monitoring framework combining data-driven learning with physics-based insights for industrial systems reliability.
Physical reservoir computing inspired by biological vestibular system addressing hardware complexity with designed uncoupled topology.
Fine-tuning small language models for domain-specific code generation in production environments with strict latency requirements.
Kaczmarz-based preference learning algorithms for real-time matchmaking with stable convergence replacing recency-biased normalization.
Extension of the Muon optimizer that reduces computational overhead in foundation model pre-training through adaptive second-moment preconditioning.
Decentralized learning framework combining adaptive gradients and compressed communication for federated settings with multiple local training steps.
First benchmark for multi-source domain generalization in automatic sleep staging with noisy labels across institutions and devices.
Closed-form method for concept erasure in diffusion models using double projections without iterative optimization.
Cross-validated self-attention with denoising for automatic modulation classification under low signal-to-noise conditions.
Characterizes necessary and sufficient conditions for reward poisoning attacks in reinforcement learning with linear MDPs.
Heterogeneous graph network with critical-path awareness for long-horizon flexible job-shop scheduling using rolling horizon optimization.
Theoretical analysis of why transformers learn the optimal DDPM denoiser for multi-token Gaussian mixture models.
Survey on attention sink phenomenon in transformers, covering utilization, interpretation, and mitigation strategies.
Automated DNN optimization for PPG-based blood pressure estimation on resource-constrained wearable devices.
Distributed consensus-based framework for recursive multi-output Gaussian processes in large-scale streaming settings.
Temporally augmented graph attention network for affordance classification from EEG sequential data.
Interprets internal computation of Leela Chess Zero transformer using sparse decomposition to explain grandmaster-level reasoning.
Spatial-temporal graph neural networks for virtual metering in sparsely instrumented district heating networks.
Theoretical bounds on Hessian eigenspectrum for cross-entropy loss in nonlinear neural networks.
Theoretical analysis of asymmetric tensor PCA showing gradient descent benefits from mild over-parameterization.
Studies fairness-aware criteria in automated machine learning frameworks to mitigate bias and discriminatory outcomes.
Multi-head attention fusion network for predicting degradation of industrial machinery operating under changing conditions.
Theoretical analysis proving that phase displacement in Kuramoto oscillator networks equals the gradient of the loss for frequency learning.
Graph neural network with diffusion-contrastive learning for wind nowcasting in regions lacking dense observation networks.
Combines SAINT attention mechanism with tree-based models like XGBoost for improved employee attrition prediction on tabular HR data.
AI agents for optimizing community water distribution systems by scheduling pumps and valves to meet demand while minimizing energy use in dynamic real-world environments.
Combines physics-informed neural networks with quantum feature mapping for battery state-of-health estimation across chemistries.
Proposes SGED-TCD framework for lag-resolved causal discovery in multivariate time series with applications to environmental data.
Presents VeriSpecGen for automatic formal specification synthesis from natural language using LLMs with traceability for code verification.
Introduces the LIRA method to defend LLMs against jailbreaks, backdoors, and unlearning by training models to align instruction representations.
Proposes CARE-ECG, a causal agent-based reasoning framework for explainable ECG interpretation that combines LLMs with physiological structure.