NCCL EP: Towards a Unified Expert Parallel Communication API for NCCL
NCCL EP, a unified communication API built on NCCL with GPU-initiated RDMA for mixture-of-experts architectures in large language models.
Training-free fine-grained visual recognition using large vision-language models with sample-wise adaptive reasoning for subordinate-level category disambiguation.
Video world models for robotics using inverse dynamics rewards to align generated trajectories with executable robot actions.
Systematic analysis of Elastic Weight Consolidation for continual learning showing suboptimal performance and proposing improvements to weight importance estimation.
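For context, canonical EWC penalizes drift in parameters deemed important for earlier tasks, using a diagonal Fisher information estimate as the per-weight importance. A minimal sketch of that quadratic penalty (generic EWC, not the paper's proposed improvements; `lam` and the toy Fisher values are illustrative assumptions):

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher, lam=1.0):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    theta      -- current parameters (flat array)
    theta_star -- parameters learned on the previous task
    fisher     -- diagonal Fisher estimate (per-parameter importance)
    lam        -- regularization strength (hypothetical setting)
    """
    return 0.5 * lam * np.sum(fisher * (theta - theta_star) ** 2)

# Toy check: drifting on "important" weights costs far more than on unimportant ones.
theta_star = np.zeros(4)
fisher = np.array([10.0, 10.0, 0.1, 0.1])  # first two weights matter for task A
drift_important = np.array([1.0, 1.0, 0.0, 0.0])
drift_unimportant = np.array([0.0, 0.0, 1.0, 1.0])
print(ewc_penalty(drift_important, theta_star, fisher))    # 10.0
print(ewc_penalty(drift_unimportant, theta_star, fisher))  # 0.1
```

The paper's critique concerns exactly this importance estimate (`fisher` above), which is where its proposed improvements apply.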
Benchmark comparing PETNN, KAN, and classical deep learning models on Burmese handwritten digit recognition dataset.
Mi:dm K 2.5 Pro, 32B parameter enterprise LLM supporting multi-step reasoning, long-context understanding, and agentic workflows in Korean and domain-specific applications.
Formal specification for admission control governance of autonomous agents in institutional B2B environments with cryptographic validation.
Graph-based memory system for LLM reward prediction that requires only limited labeled data for reinforcement learning post-training.
Theoretical analysis of multiplicative updates for matrix mechanism convergence in private machine learning optimization.
Aerial visual localization method for dense urban environments using building silhouette alignment.
Diffusion Transformer framework for virtual try-on and try-off tasks, combining both into a unified model for fashion applications.
Research on chain-of-thought faithfulness in LLMs showing measurement methodology significantly affects reported faithfulness scores across 12 open-weight models.
KidGym: 2D grid-based reasoning benchmark evaluating MLLMs on spatial intelligence inspired by Wechsler Intelligence Scales.
CRoCoDiL: Continuous semantic space diffusion model for non-autoregressive language generation with improved coherence.
arXiv paper on voice privacy using attribute-based perspective to measure speaker anonymization effectiveness.
Industrial-scale RAG framework evaluated on automotive manufacturing requirements engineering with unstructured heterogeneous documentation.
Memory-Keyed Attention: Efficient attention mechanism reducing KV cache memory for long-context LLM inference and training.
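The memory pressure such methods target is easy to quantify with standard KV-cache accounting: one K and one V tensor per layer, each of shape [batch, kv_heads, seq_len, head_dim]. A minimal sketch under assumed, hypothetical model dimensions (not the paper's mechanism or configuration):

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    """Standard KV-cache accounting: 2 tensors (K and V) per layer,
    each holding batch * kv_heads * seq_len * head_dim elements."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem

# Hypothetical 7B-class configuration: 32 layers, 32 KV heads, head_dim 128, fp16.
gib = kv_cache_bytes(layers=32, kv_heads=32, head_dim=128,
                     seq_len=32_768, batch=1) / 2**30
print(f"{gib:.0f} GiB")  # 16 GiB for a single 32k-token sequence
```

At these assumed dimensions a single long-context request already consumes 16 GiB of cache, which is the cost any KV-cache-reduction scheme is trying to shrink.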
LPNSR: Diffusion-based image super-resolution using LR-guided noise prediction. Addresses efficiency-quality trade-off.
TRACE: Multi-agent system using autonomous reasoning for seismological analysis of earthquake mechanisms from geophysical observations.
Unsupervised self-evolution training framework for multimodal LLMs achieving reasoning improvements without annotated data.
DeepXplain: Explainable deep reinforcement learning framework for multi-stage APT cyber defense with provenance graphs.
Case study on LLM-powered workflow optimization for multidisciplinary software development in automotive industry.
mSFT: Iterative algorithm addressing overfitting in multi-task supervised fine-tuning by heterogeneous data mixture optimization.
arXiv paper on smartwatch-based badminton stroke evaluation using wearable sensors and machine learning.
arXiv paper on hyperbolic embeddings for vision-language models to capture hierarchical part-whole relationships.
arXiv paper on UAV navigation using cross-view geo-localization in GNSS-denied environments. Computer vision focus.
arXiv paper on safe offline reinforcement learning with budget constraints. Addresses safety-reward trade-offs in sequential decision making.
Research on synthetic data generation using LLMs to improve smaller model fine-tuning. Analyzes diversity and distribution in embedding space.
Uncertainty estimation method for LLMs using intra-layer local information scores from cross-layer agreement patterns.
Sparse Feature Attention method reducing transformer self-attention complexity through feature-level sparsity instead of sequence-level sparsity.
Mathematical framework interpreting LLM hidden states as points on latent semantic manifolds with Riemannian geometry.
K-means clustering for career guidance pathway adaptation based on individual trait combinations.
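As background, K-means here can be read as plain Lloyd's algorithm over trait vectors. A minimal sketch (generic K-means, not the paper's pathway-adaptation pipeline; the toy data and `k` are illustrative assumptions):

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Lloyd's algorithm: assign points to the nearest centroid, recompute means."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its closest centroid (squared Euclidean distance).
        labels = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        # Move each non-empty centroid to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Toy "trait vectors": two well-separated groups should land in two pure clusters.
X = np.vstack([np.random.default_rng(1).normal(0, 0.1, (10, 3)),
               np.random.default_rng(2).normal(5, 0.1, (10, 3))])
labels, _ = kmeans(X, k=2)
print(labels)
```

Each resulting cluster would then stand in for one "trait combination" to which a guidance pathway is matched.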
Training-free hallucination detector for LLMs using sample transform cost to measure output distribution complexity.
Progressive Quantization method for robust vector tokenization in multimodal LLMs and diffusion models.
Chinese financial news dataset and benchmark for evaluating LLMs as autonomous agents in macro and sector asset allocation.
Diffusion model-based approach for seismic full-waveform inversion that regularizes nonlinear inverse problems.
UniFluids: Conditional flow-matching framework using diffusion Transformers to unify the learning of solution operators across diverse PDEs.
Multi-modal CNN-LSTM framework with attention and focal loss for real-time elderly fall detection.
Systematic bias correction methods for AI-based tropical cyclone track and intensity forecasting.
Decision Transformer approach for offline emergency vehicle signal preemption optimization without online exploration.
Scalable spatial-temporal diffusion model for long-duration group dance generation from music for entertainment applications.
Geometric Mixture-of-Experts framework for graph representation learning that adaptively routes node representations across multiple Riemannian manifolds.
Graph neural network method using message-passing transformers for chemical mechanism reduction in combustion simulations.
Physics-informed Schrödinger Bridge approach for data assimilation with sparse observations in PDE-governed systems.
Machine learning methods to emulate climate models and reduce computational costs while maintaining scientific validity.
Infrastructure framework for governance and post-market surveillance of adaptive medical AI systems under FDA and EU regulations.
Multi-task deep learning framework for predicting lithium-ion battery state-of-health and remaining useful life in electric vehicles.
Delta-Aware Quantization framework for post-training LLM weight compression that preserves knowledge by protecting small-magnitude parameter deltas during quantization.
Classification method for wind power ramp event forecasting addressing severe class imbalance in grid stability systems.
Method for adding trained persistent memory to frozen decoder-only LLMs using memory adapters in latent space.