[b]=[d]-[t]+[p]: Self-supervised Speech Models Discover Phonological Vector Arithmetic
Analysis of self-supervised speech models across 96 languages showing phonological vector arithmetic structure.
Machine learning method for engineering adeno-associated viral capsids for gene therapy delivery.
Case study using ChatGPT-5 thinking mode to resolve mathematical conjecture on spectral regions.
Analysis of why LLM caching methods fail, with a proposal to use intent canonicalization and few-shot learning to reduce cost.
Convergence analysis of Stochastic Mirror Descent with matrix parameters in overparameterized regime.
Knowledge distillation method transferring histopathology embeddings to micro-ultrasound for prostate cancer grading.
Analysis of language agent failures on tool-use tasks caused by canonical path deviation rather than capability limitations.
Philosophical phenomenological analysis of machine learning through Heideggerian concepts.
IAPO: information-theoretic post-training framework for optimizing token efficiency in LLM reasoning chains.
Framework for photorealistic 3D human animation combining kinematics with video diffusion priors.
Event-triggered gossip framework for distributed learning that reduces inter-node communication overhead.
CaReFlow method for multimodal fusion using rectified flow to address modality gap.
Generative framework using diffusion models for simulating point defects in inorganic solids.
Mechanistic interpretability study of how LLMs represent and use valence-related information in decision-making tasks.
Deep learning method for facial expression manipulation and data augmentation using controllable image editing.
Benchmark for Multi-Agent Reinforcement Learning algorithms on urban energy management tasks using CityLearn environment.
Mechanistic analysis of procedural hallucinations in LLMs showing attention and readout-stage routing errors cause value retrieval failures.
Theoretical study of how low-precision training affects scaling laws in high-dimensional linear regression with implications for quantization.
Uses soft mixture-of-experts RL for exploration in directed controller synthesis with improved zero-shot generalization to larger systems.
Bayesian nonparametric approach for predictive maintenance handling unknown failure modes in manufacturing without labeled data.
Proposes metasurface-integrated wireless networks for low-latency, energy-efficient edge inference in 6G IoT systems.
TOPReward uses token probabilities from Vision-Language-Action models as zero-shot reward signals for robotics reinforcement learning without task-specific training.
US-JEPA adapts Joint-Embedding Predictive Architectures for medical ultrasound by predicting masked latent representations instead of raw pixels.
Panel analysis quantifies relationship between forest loss and carbon emissions at subnational US scales using econometric methods.
SplitLight open-source toolkit addresses reproducibility issues in recommender systems evaluation by exploring dataset preparation and splitting strategies.
MentalBlackboard benchmark evaluates Vision-Language Models on spatial visualization tasks like paper folding and hole punching.
Vid2Sid uses video data to calibrate robot simulator physics parameters, addressing sim-to-real gap without manual tuning or black-box optimization.
Value-guided multi-path reflection method for optimizing Vision-Language Models in complex robotic manipulation through improved planning and reasoning.
Multi-armed bandit approach for adaptive data augmentation to improve implicit pattern recognition in vision and language models.
Cybersecurity framework using a CNN-LSTM model to analyze biometric and environmental data for context-aware security decisions.
Foundation model for molecular chemistry extending MACE with explicit long-range electrostatic interactions and charge transfer modeling.
IR³ framework detects and mitigates reward hacking in RLHF by reverse-engineering and interpreting internalized objectives in LLM alignment.
OptiRepair uses LLM agents to diagnose and repair infeasible supply chain optimization models through closed-loop diagnosis and domain-agnostic repair.
Laplacian multiscale flow matching framework for image generation using mixture-of-transformers with causal attention.
Sequential correction algorithm for physics-informed neural networks, improving training efficiency and accuracy for PDE solving.
Human-guided agentic AI system for multimodal clinical prediction, combining autonomous workflows with domain expertise on AgentDS Healthcare benchmark.
Feature caching approach for diffusion transformers using relational information to accelerate inference beyond temporal extrapolation.
Hierarchical Mixture-of-Agents architecture using lightweight router for cost-optimized LLM inference trading off accuracy and computational expense.
Adaptive Rejection Sampling framework for selective reasoning in LLMs, reducing token waste on simple requests while maintaining chain-of-thought benefits.
Cost-aware active search algorithm for autonomous agents to balance exploration and exploitation with unknown target recovery.
Evaluates robustness of AI age estimation systems to cosmetic modifications using simulated physical changes.
Investigates HTML-to-text extraction methods for LLM pretraining datasets, showing that relying on a single extractor leads to suboptimal web data coverage.
Active data acquisition algorithm using inverse curvature for deep neural networks without explicit posterior inference.
Framework for high-dimensional generative modeling using manifold learning to balance support fidelity and sampling efficiency.
Design principles for integrating LLMs into automotive system engineering with focus on trustworthiness and verification in safety-critical pipelines.
Proposes denoising particle filters for robot state estimation trained on single-step objectives rather than sequence unrolling.
Federated learning approach for personalized longitudinal medical report generation that preserves privacy while accounting for temporal dynamics.
SkillOrchestra system for routing across multiple AI agents via learned skill transfer, reducing routing collapse in multi-turn conversations.
Vision-Language-Action models with universal pose pretraining for improved 3D state perception and embodied AI tasks.
Ensemble machine learning methods for dynamic time-to-event predictions in precision medicine applications.