DASH: Dynamic Audio-Driven Semantic Chunking for Efficient Omnimodal Token Compression
Token compression method for omnimodal LLMs using dynamic audio-driven semantic chunking to reduce inference costs for audio-visual processing.
Domain adaptation approach for remaining useful life prediction using evidential learning under incomplete degradation trajectories.
Study on engineering challenges in LLM-based multi-agent systems, addressing context pressure, coordination errors, and system drift at scale.
Defense framework against backdoor attacks in LLMs using trigger generation and inversion to locate and mitigate malicious triggers.
Study on over-smoothing in hypergraph neural networks using Ricci flow theory to improve message passing and layer depth handling.
Research on using inference time as a proxy to estimate LLM energy consumption, addressing opacity in API-based model access and environmental impact.
SEMAG: self-evolutionary multi-agent code generation framework that decomposes programming tasks into planning, coding, debugging stages with adaptive workflow selection.
Uncertainty-guided multi-expert framework for imbalanced sequence learning addressing poor expert specialization and prediction conflicts in long-tailed data.
Retrieval-augmented generation framework using GPT-4 to accelerate CO2 reduction catalyst discovery by exploring chemical spaces and interpreting results.
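Retrieval-augmented generation in this style can be sketched generically: retrieve the most relevant documents for a query, then assemble them into a grounded prompt. This is a minimal illustrative sketch, not the paper's catalyst-discovery pipeline; the toy corpus, the word-overlap scorer, and the prompt template are all assumptions.

```python
# Minimal sketch of a generic retrieval-augmented generation loop.
# The corpus, the naive word-overlap scorer, and the prompt template
# are illustrative assumptions, not the paper's actual pipeline.

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble retrieved context into a grounded prompt for an LLM."""
    context = "\n".join(f"- {d}" for d in docs)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."

corpus = [
    "Copper catalysts favor C2+ products in CO2 reduction.",
    "Silver electrodes produce CO selectively.",
    "Perovskite solar cells degrade under humidity.",
]
prompt = build_prompt(
    "Which metal favors multi-carbon products in CO2 reduction?",
    retrieve("copper CO2 reduction products", corpus),
)
```

In a real system the overlap scorer would be replaced by dense embeddings and the prompt sent to the LLM; the structure of retrieve-then-generate stays the same.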
Method bridging learned embeddings and handcrafted features in event sequences for financial systems, addressing interpretability and latency constraints in production ML.
Large-scale competition analysis revealing LLM agents' vulnerability to indirect prompt injection attacks through adversarial instructions in external content sources.
Framework and prototype for navigable dataset map in engineering design and systems engineering to improve data accessibility and research reproducibility.
Meta-TTRL: metacognitive test-time reinforcement learning framework for unified multimodal models enabling knowledge accumulation across similar prompts in text-to-image generation.
MiroThinker-1.7 and H1: research agents combining structured planning with enhanced verification and multi-step contextual reasoning for long-horizon tasks.
ClawWorm: first documented self-propagating attack across LLM agent ecosystems, demonstrating security vulnerabilities in OpenClaw platform with 40,000+ active instances.
Technique improving pretrained diffusion/flow-matching robot policies by replacing sampled noise with optimized constant vectors for better downstream reward performance.
Simulation Distillation method enabling sim-to-real transfer in robotics by pretraining world models in simulation for rapid real-world adaptation with low data.
CorrectionPlanner: autonomous driving planner using reinforcement learning with explicit self-correction mechanism in propose-evaluate-correct loop for unsafe action handling.
Evaluation of how LLMs and tokenizers handle Arabic root-pattern morphology, testing whether models capture genuine morphological structure or rely on surface memorization.
Differentiable framework for computing geodesics on 3D meshes with parallelization support to improve machine learning on non-Euclidean geometric domains.
OMNIFLOW: multimodal agent combining LLMs with physics-grounded reasoning to handle spatiotemporal PDE dynamics without domain-specific fine-tuning, reducing non-physical hallucinations.
Security framework for LLM-based multi-agent systems addressing manipulation risks from malicious agents that exploit communication channels in interactive agentic networks.
Large-scale empirical study demonstrating prediction-equivalent models produce substantially different feature attributions across 24 datasets, challenging explainability assumptions.
Study showing LLM stability across repeated runs does not guarantee agreement with statistical ground truth in data-constrained scientific decision-making workflows.
Informationally Compressive Anonymization (ICA) method for privacy-preserving ML that protects sensitive data without the performance degradation of differential privacy or homomorphic encryption.
arXiv: Design principles for XAI interfaces enabling scientists to probe and interpret LLM behavior in reading and research workflows.
arXiv: Counteractive RL framework addressing exponential state space complexity for efficient deep reinforcement learning.
arXiv: Study on electrodermal activity as standalone physiological signal for detecting aerobic exercise in wearables.
arXiv: Python library for unit circle-based computing using complex phasors and unitary gates on torus topology.
arXiv: LLM ensemble approaches for word sense plausibility rating in SemEval-2026 using zero-shot and Chain-of-Thought prompting.
arXiv: Framework for Internet of Physical AI Agents addressing interoperability, security, and sustainability in IoT environments.
arXiv: Privacy-preserving federated learning for Alzheimer's classification using 3D MRI with site-aware techniques.
arXiv: Practical guide to AI-assisted research in mathematics and ML, covering productive tool use and responsible guardrails.
arXiv: Analysis of 10,469 experiments by Claude Opus and Gemini agents across 108k design space cells for ML architecture search.
arXiv: VIBEPASS empirically evaluates LLM self-diagnosis and repair capabilities for autonomous software engineering.
arXiv: Benchmarking causal discovery algorithms on synthetic healthcare data for fairness and utility evaluation.
arXiv: LLM-guided neural architecture search for multimodal time-series classification under data-locality constraints for healthcare.
arXiv: LLM family with dynamic tokenizers eliminating fixed vocabulary constraints, up to 70B parameters, improved domain/language adaptation.
MobileLLM-Flash: methodology for designing on-device LLMs optimized for latency constraints using hardware-in-the-loop architecture search.
ExpertGen automates expert policy generation in simulation for scalable sim-to-real robotic behavior cloning transfer.
MoLoRA enables per-token adapter routing for multimodal generation and mixed-capability requests in multi-adapter serving.
Lightweight proxy models reduce LLM query costs and latency 100x for AI-augmented SQL operations.
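The proxy pattern behind this kind of speedup can be sketched as a cascade: a cheap model scores every row, and the expensive LLM is queried only for rows in the proxy's uncertain band. The keyword-based proxy, the thresholds, and the `expensive_llm` stub below are illustrative assumptions, not the paper's system.

```python
# Sketch of a proxy-model cascade for an AI-augmented SQL predicate.
# The keyword proxy, the confidence thresholds, and the expensive_llm
# stub are illustrative assumptions.

def proxy_score(text: str) -> float:
    """Cheap stand-in classifier: fraction of positive keywords present."""
    positives = {"great", "excellent", "good"}
    words = set(text.lower().split())
    return len(words & positives) / len(positives)

def expensive_llm(text: str) -> bool:
    """Stub for a costly LLM call (assumed interface)."""
    return "good" in text.lower()

def cascade_filter(rows: list[str], low: float = 0.0, high: float = 0.5):
    """Keep rows the proxy confidently accepts; escalate the uncertain band."""
    kept, llm_calls = [], 0
    for r in rows:
        s = proxy_score(r)
        if s > high:            # confident accept: no LLM call
            kept.append(r)
        elif s > low:           # uncertain: escalate to the LLM
            llm_calls += 1
            if expensive_llm(r):
                kept.append(r)
        # s <= low: confident reject, also no LLM call
    return kept, llm_calls
```

The cost saving comes from the confident accept/reject branches: the LLM is invoked only for the narrow uncertain band, so most rows are resolved by the proxy alone.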
Physics-based preprocessing framework standardizes heterogeneous medical images at scale for improved model generalization.
Multi-task RL with chain-of-thought prompting aligns paralinguistic understanding and generation in speech LLMs.
Three-stage framework for dysarthric speech severity estimation using pseudo-labeling and data augmentation.
xr-adaptive-modality platform studies modality-specific interventions for XR interfaces balancing gaze and hand input.
RadAnnotate uses LLMs with retrieval augmentation and selective automation for efficient radiology report annotation.
FormulaCode benchmark evaluates LLM coding agents on repository-level codebase optimization with realistic multi-objective constraints.
FlatLands dataset and benchmark for bird's-eye view floor completion from single egocentric images.
Probing-based analysis of moral reasoning trajectories in LLMs across six models showing systematic multi-framework deliberation.