Audio Spatially-Guided Fusion for Audio-Visual Navigation
Audio-visual navigation system for autonomous agents to localize and navigate toward vocalizing targets in 3D environments.
Deep learning framework for predicting wireless channel characteristics in vehicular 6G communications using visual feature fusion.
Privacy-preserving group emotion recognition model using variational encoder-multi-decoder architecture without per-person feature extraction.
Approach using LLMs to detect and repair errors in MPI code for high-performance computing and distributed training frameworks.
LumiVideo, an agentic system that mimics professional video colorists' workflows, with interpretable iterative control for automated color grading.
Research on deep generative models (diffusion, flow matching) for high-dimensional distributions on constrained submanifolds in physics data.
Self-Directed Task Identification framework enabling models to autonomously identify target variables in zero-shot learning without pre-training.
Framework using Mixture-of-Gaussians trajectory prediction for diverse multi-agent play generation in team sports.
Survey of deep learning approaches for diabetic retinopathy detection addressing dataset limitations and geographic diversity issues.
Research investigating whether frontier reasoning models are necessary for mathematical proof verification versus smaller LLM judges.
NLP research on skeleton-based coherence modeling for narrative generation and detection of incoherent story structures.
Empirical evaluation of LLMs as behavioral simulators for predicting the effects of 11 climate-psychology interventions, using data from 59,508 participants.
Research studying geometric structure of layer-wise updates in deep language models across Transformer and state-space architectures.
VERTIGO system for cinematic camera trajectory generation with visual preference optimization for realistic shot composition.
Hierarchical Interpretable Label-Free Concept Bottleneck Model enabling interpretability at multiple abstraction levels, unlike existing single-level CBMs.
Diffusion-based foundation model that generates synthetic satellite imagery for wildfire detection without task-specific retraining.
Transformer-based framework using Vision Transformer for predicting fluid flows in energy systems, applied to gas injection phenomena.
Zero-shot malware family classification using weighted hierarchical ensembles of LLMs, avoiding the need for labeled datasets and handcrafted features.
Image Prompt Packaging method to reduce token costs in multimodal LLMs by embedding structured text into images, benchmarked across frontier models.
Vision-language model for lumbar spinal stenosis diagnosis from MRI with adaptive loss function for class imbalance handling.
Study of social meaning in LLMs, introducing calibration metrics and pragmatic prompting strategies to improve quantitative approximation of human reasoning.
Unified framework for deriving sparse Bayesian learning algorithms using neural networks and majorizer learning.
System for private long-term memory in personal AI using trusted hardware and oblivious RAM to hide data access patterns from providers.
Theoretical and empirical evaluation of using LLM-generated preferences to warm-start contextual bandits, examining alignment with actual user preferences.
Analysis of stability in post-hoc feature attribution methods for vision systems under input perturbations, introducing evaluation suite.
LLM-based code generation for security vulnerabilities using CAPEC and CWE frameworks, addressing gaps in existing vulnerability datasets.
Study of cultural bias in LLM text generation, introducing task of culturally-adapted artwork descriptions for different audience groups.
Integrative review of generative AI impact on entrepreneurship across opportunity recognition, evaluation, resource assembly, and venture launch stages.
Research on safety alignment vulnerabilities in LLMs, examining jailbreak-tuning and weight orthogonalization methods that can disable safety guardrails.
Comparative study of LLM versus human coordination in group games, revealing differences in volatility and action bias in adaptive strategies.
Vision-language model extension for referring image segmentation using autoregressive decoding and reinforcement learning refinement.
System grounding LLM-generated explanations in formal representations to enable interactive exploration of mathematical proofs.
Tool for developing research ideas through dynamic literature contextualization and critique using LLMs.
Security analysis of memory-based LLM web agents, demonstrating environment-injected poisoning attacks through persistent memory exploitation.
Vision foundation model applied to rapid building damage mapping from post-earthquake imagery for disaster response.
Continual graph learning method addressing feature drift in non-exemplar settings using analytic continual learning.
Self-supervised depth estimation for articulated vehicles using cross-vehicle 3D geometric consistency.
Game benchmark with 124 bugs for evaluating LLMs' ability to autonomously discover bugs as QA engineers in dynamic environments.
Distributed training approach for graph neural networks using communication-free sampling and hybrid parallelism.
Theoretical analysis of reinforcement learning alignment limitations in LLMs, demonstrating generalization failures through compound jailbreak attacks.
Efficient model compression using randomized subspace iteration for low-rank decomposition of pretrained models.
Study of sycophancy propagation in multi-agent LLM systems, examining how agents' awareness of others' biases affects collaborative discussions.
Large-scale empirical study of coordination dynamics in LLM multi-agent systems, analyzing scaling behavior and power laws in collective cognition.
Agentic framework using LLMs for automated clinical trial evidence synthesis and meta-analysis with eligibility-aware study selection.
Using sparse autoencoders to understand geometric structure of belief representations in transformer models and LLMs.
Token-space adversarial attacks on reward models used in RLHF, introducing token mapping perturbation attack paradigm beyond semantic manipulation.
Framework for reducing computational overhead in 3D multimodal LLMs through adaptive token reduction for resource-constrained deployment.
AI agent system for document forgery detection using evidence-grounded reasoning, combining detection, localization, and explanation for document safety.
Controlled replication study examining vocabulary constraints versus linguistic structures in LLM reasoning, testing E-Prime effects on cognition.
Systematic evaluation framework for LLM formal reasoning capabilities using Chomsky hierarchy and computational complexity theory.