Physics-Informed Neural Engine Sound Modeling with Differentiable Pulse-Train Synthesis
Differentiable neural synthesis architecture for engine sound modeling using physics-informed pulse-train resonators.
Competition for end-to-end document image translation combining OCR and NLP for complex layout preservation.
Fully convolutional diffusion model for efficient generative modeling, compared against transformer alternatives.
Prompt-based document layout analysis framework using domain-specific descriptive knowledge for improved multi-domain generalization.
Benchmark evaluating gender stereotypes in LLMs across healthcare contexts with intersectional social determinants of health factors.
Motion forecasting system for autonomous vehicles handling open-world scenarios with imperfect perception and evolving object taxonomy.
Benchmark dataset and analysis of LLM bias showing models prioritize moral reasoning over commonsense knowledge.
AI agent framework for automated clinical target volume delineation in radiotherapy that adapts to guideline changes without retraining.
Vision-language-action model for autonomous driving combining perception and planning distillation to improve stability.
Normalizing flows framework for time series anomaly detection with temporal conditioning and uncertainty quantification.
Vision-language model adaptation method using evolutionary prompt learning to prevent catastrophic forgetting while maintaining parameter efficiency.
Research on portable O(1) autoregressive caching for state-space models via XLA compilation, removing hardware-specific kernel dependencies.
Research on online continual learning in transformers using routing mechanisms without catastrophic forgetting in non-stationary data streams.
Research on a biologically inspired learning algorithm addressing backpropagation limitations for complex temporal pattern recognition in cortex-like systems.
Framework for interpretable synthetic data generation using vision-language models with grounded evaluation metrics for downstream tasks.
Research on persona-adaptive prompting for evaluating multi-modal LLM agents in customer experience scenarios with dual-control interactions.
Training-free KV-Lock framework for video diffusion models improving foreground quality while maintaining background consistency.
Open-source framework for time series anomaly detection using graph neural networks with critical evaluation and standardized benchmarks.
ML research benchmarking three paradigms for automated cardiac risk classification from unstructured electronic health records using large-context LLMs.
Large-scale Vietnamese VQA dataset automatically constructed using pre-trained transformers.
Unified instruction-tuning framework for task-oriented dialog systems using schema-aware prompting.
Active learning pipeline for efficient preference data generation to improve RLHF alignment of LLMs.
Benchmark and improvement strategies for multi-audio understanding in large audio-language models.
LLM-based method for generating actionable peer review feedback using rebuttal data as supervision.
Benchmark for evaluating multimodal LLMs on egocentric scene prediction with long-horizon action reasoning.
Personalization framework for vision-language models enabling customized AI assistants without additional training.
Simulation-based inference approach for estimating neutrino interaction parameters in physics experiments.
Quantum-classical hybrid framework for financial volatility forecasting using quantum circuit models.
Adaptive channel pruning for split learning to reduce communication overhead in federated training.
RAG-based AI assistant prototype for knowledge retrieval across large scientific collaboration documentation.
Lightweight pseudo-projector module to improve transformer robustness by correcting hidden representations.
QA task over multi-agent egocentric video data for human-AI collaboration scenarios.
Benchmark suite for evaluating large audio language models on audio understanding tasks beyond speech recognition.
Hierarchical graph attention network for spectrum demand prediction using geospatial data.
Dynamics-aware policy learning for robotic manipulation in cluttered scenes using non-prehensile contact.
Memory-aware replay strategy for continual LLM fine-tuning to prevent catastrophic forgetting during sequential training.
Data-driven ML approach for forecasting spectrum demand in wireless networks.
Framework for synthesizing missing brain imaging modalities using diffusion models for Alzheimer's diagnosis.
Data-driven methodology for characterizing spectrum demand patterns across space and time in 6G networks.
Direct cardiac analysis from undersampled k-space MRI data, avoiding the intermediate image reconstruction step.
Analysis of learning rate sensitivity in PPO actor-critic RL using hidden neuron behavior and overfitting metrics.
Neural debugger training LLMs on Python execution traces to enable line-by-line execution prediction for developer assistance.
BEACON predicts navigation affordances from language instructions and visual observations, handling occluded regions via vision-language models.
Agent-based model extending the bee equation with emotional valence and arousal for swarm collective decision-making.
User study evaluating LLM-powered sighted guide for making social VR accessible to blind and low vision people.
Mechanistic interpretability study of how feature correlations shape superposition in neural networks beyond sparse, uncorrelated settings.
Daily-Omni benchmark for audio-visual reasoning with temporal alignment in multimodal LLMs using 684 real-world videos.
MMGraphRAG extends GraphRAG to multimodal knowledge graphs preserving visual structure, reducing LLM hallucinations in vision-language tasks.
CMASE framework combining generative agent-based modeling with virtual ethnography for interactive social simulation research.
VistaWise framework integrating cross-modal domain knowledge graphs with fine-tuned smaller models for cost-effective embodied agents in Minecraft.