Rethinking Attention Output Projection: Structured Hadamard Transforms for Efficient Transformers
Replaces dense attention output projections with a fixed Walsh-Hadamard transform to reduce parameters and inference cost in transformer models.
Method for generating plausible counterfactual explanations for time series classification using soft-DTW alignment with k-nearest neighbors.
Novel optimization algorithm using fractional calculus to handle noisy weight updates and improve gradient descent performance on imbalanced datasets.
Stable offline multi-agent reinforcement learning approach addressing instability in non-linear value decomposition for MARL from batched data.
Training-time regularization method generating virtual outliers in feature space to improve out-of-distribution robustness in image classification networks.
SYNAPSE framework for neuron-level interpretability and perturbation analysis in sequence encoding models to improve transparency and robustness.
Dynamic scaling framework for class incremental learning that adaptively manages architectural growth and memory overhead while preventing catastrophic forgetting.
LycheeCluster method for efficient long-context LLM inference using structure-aware chunking and hierarchical KV cache indexing to reduce attention complexity.
Data-driven approach for uncertainty-aware deterioration risk prediction using multimodal data in clinical decision support systems.
Recasts efficient chain-of-thought prompting in LLMs as a compression problem under the Information Bottleneck principle to reduce token usage and inference cost.
Structure-preserving neural network operator inference framework for non-intrusive reduced-order modeling of dynamical systems from snapshot data.
Framework for efficient credal prediction using decalibration to represent epistemic uncertainty in safety-critical machine learning applications.
Safe move prediction in chess using oracle-guided soft shielding to combine imitation and reinforcement learning for reducing safety-critical errors.
Novel approach to multi-objective reinforcement learning using concave scalarization to optimize nonlinear utility functions over multiple reward objectives.
Unsupervised graph alignment method using deep learning and optimal transport to find node correspondence across different graphs without labeled pairs.
Research on learning compact state representations in reinforcement learning using Laplacian eigenvectors to address dimensionality challenges in large-scale RL problems.
Drift2Act controller for handling distribution drift in deployed ML systems with budgeted interventions and online risk certificates.
DualFlexKAN extends Kolmogorov-Arnold Networks with learnable dual-stage functions to address quadratic parameter scaling and architectural limitations.
Streaming deep reinforcement learning method for continuous control on resource-limited hardware using online updates without replay buffers.
MAGIC Net approach for streaming continual learning that combines architectural strategies with RNNs to handle concept drift and temporal dependence.
Mathematical derivation of integral formulas for vector spherical tensor products and Clebsch-Gordan coefficients.
Function-preserving expansion method for fine-tuning pre-trained models without catastrophic forgetting by replicating model capacity.
Input space partitioning architecture using data heterogeneity measures to improve supervised learning accuracy on mixture-of-distributions data.
Theoretical framework connecting group theory and group entropies to mirror descent optimization algorithms for flexible machine learning updates.
Self-conditioned GAN approach for unsupervised trajectory forecasting that learns different behavioral modes from 2D trajectories.
Comprehensive analysis of unsupervised reinforcement learning with verifiable rewards for scaling LLM training without a supervision bottleneck, including a taxonomy and experiments.
AI-guided evolutionary search approach to improve the Random Offerer mechanism in bilateral trade, analyzing gains-from-trade efficiency bounds.
Split Federated Learning architecture optimization for improving training accuracy and reducing delay in distributed ML model training.
Impermanent benchmark for evaluating temporal generalization in time-series forecasting models, addressing data contamination issues in foundation models.
XInsight multi-agent framework for LLM-driven psychological counseling support with stage-consistent workflow aligned to therapeutic practices.
Isotonic Layer framework for model calibration and debiasing in recommendation systems using piecewise linear fitting and monotonic constraints.
Research on a unified understanding of phenomena in Transformer language models through hierarchical latent structures in data generation processes.
Hierarchical Embedding Fusion method for retrieval-augmented code generation that compresses repositories into dense vector hierarchies to reduce inference cost and context noise.
Multi-agent deep reinforcement learning for radio resource allocation in V2X networks, addressing challenges like non-stationarity, coordination, and partial observability.
Open-source StarCraft II benchmark for reinforcement learning research with accessible compute requirements and curriculum design capabilities.
DeepScope: Deep learning system for rapid water safety testing via microscopic image inference, eliminating pathogen incubation.
GraphSkill: LLM-based approach for complex graph reasoning using retrieval-augmented code generation with documentation guidance.
Exploration Space Theory: lattice-theoretic framework for location-based recommendation systems with prerequisite dependencies.
RECAP: Reservoir computing approach using local Hebbian plasticity for image recognition, inspired by biological neural mechanisms.
Research on risks of pruning-based unlearning in diffusion models, identifying concept revival dangers in weight pruning approaches.
Comprehensive review of quantum deep learning approaches integrating quantum and quantum-inspired resources with deep learning.
Study evaluating how graph construction methods affect GNN performance for IoT botnet detection.
Graph-based approximate nearest neighbor search optimized for modern AI workloads with online insertion support.
Augmentation technique for human motion sequence analysis using ensemble methods respecting kinematic constraints.
HyperTokens generates task-specific prompt tokens for continual video-language understanding with multimodal LLMs.
Machine learning approach for unmixing hyperspectral infrared imagery of historical oil painting cross-sections.
Graph neural network approach for muon particle momentum estimation in CMS trigger systems at the Large Hadron Collider.
Hybrid few-shot learning model combining XAI and FSL for plant leaf disease classification under limited training data.
Parallel Relative Policy Optimization for improving chart understanding in large vision-language models with deep reasoning capabilities.
Image reconstruction using unsupervised learning for beam diagnostics in particle accelerators with severe degradation and noisy data.