Multi-Level Causal Embeddings
Framework for causal embeddings enabling multiple detailed models to map into sub-systems of coarser causal models.
Examines AI agents with persistent state, tool access, and skills for autonomous execution of social science research pipelines.
ConstraintBench evaluates whether LLMs can directly solve constrained optimization problems without solver access.
Study on LLM vulnerability to jailbreak attacks using classical Chinese prompts to bypass safety constraints.
Geometric analysis of cost minimization in shallow ReLU networks with constructive bounds avoiding gradient descent.
Dispatcher/Executor principle for multi-task reinforcement learning using abstraction to improve generalization across tasks.
DirMixE addresses long-tail recognition with unknown test distributions using hierarchical label variation modeling.
R2GenCSR uses LLMs with visual feature extraction from X-ray images for automated radiology report generation.
Speech separation model TIGER reduces parameters and computational costs using time-frequency interleaving for low-latency processing.
Research framework for sparse counterfactual explanations using optimal transport and Shapley values for model interpretability.
Research on temporal retrieval methods for predicting micro-video popularity with volatile engagement dynamics.
Research on robust watermarking techniques for distinguishing generated from real content in generative models.
Research paper on grounding LLMs with real-time financial data for knowledge-aware financial agent applications.
Semantic parallelism technique for efficient MoE LLM inference via model-data co-scheduling, which reduces communication bottlenecks.
Optimization perspective on reward model quality in RLHF, showing that accuracy alone does not capture what makes a reward model an effective teacher.
Domain decomposition approach for neural operators to improve geometry generalization and transferability in PDE solving.
LLM-empowered hierarchical RIC controller for O-RAN addressing cooperation, computational demands, and domain-specific adaptation.
FineScope framework using SAE-guided data selection for domain-specific LLM pruning and finetuning while maintaining performance.
Feature selection method using permutation-invariant embeddings and policy-guided search for complex feature interactions.
Interview-based empirical study of how ML practitioners at Big Tech approach fairness in recommender systems.
Agentic Predictor using multi-view encoders for performance prediction in LLM-based agentic workflows without exhaustive evaluation.
Method bridging target-free and target-based deep reinforcement learning to reduce memory requirements and improve update propagation.
LiteReality pipeline converting RGB-D scans into compact interactive 3D virtual scene replicas with graphics-ready features.
SMT solver extensions for approximate model counting on hybrid discrete-continuous formulas.
Framework converting generative multimodal LLMs into zero-shot discriminative embedding models without extensive pre-training.
AMBER-AFNO benchmark for lightweight 3D medical image segmentation using Adaptive Fourier Neural Operators.
OM2P: offline multi-agent reinforcement learning using flow-based generative models with improved sampling efficiency.
Framework for improving LLM context-aided forecasting with diagnostic tools and reduced computational costs for practical deployment.
AC3 reinforcement learning framework for long-horizon robotic manipulation using continuous action chunking with sparse rewards.
LumiMAS framework for real-time monitoring and observability of LLM-based multi-agent systems, addressing system-wide failure detection.
SAT reduction approach for automating input/output logics, a family of deontic logics for reasoning over norms and obligations.
Latent Self-Consistency: method for reliable majority voting over LLM outputs that consistently handles both short-form and long-form reasoning tasks.
Once4All: LLM-synthesized test generator for SMT solver fuzzing using skeleton guidance to uncover bugs in evolving solver versions.
Veritas: pattern-aware deepfake detection system with the HydraFake dataset, bridging the gap between academic benchmarks and industrial deployment.
Draw-In-Mind: multimodal model rebalancing designer and painter roles to improve precision in image editing tasks.
LLaDA, a diffusion-based large language model, applied to automatic speech recognition with deliberation-based post-processing of Whisper transcripts.
E-CIT: plug-and-play ensemble framework for conditional independence testing to reduce computational bottlenecks in constraint-based causal discovery.
Investigation of how in-context learning emerges in world models for environmental dynamics prediction, beyond static zero-shot performance.
Study of the role of activation function design in preventing plasticity loss during continual learning, beyond catastrophic forgetting.
Meta-weighted online sampling approach for aligning LLMs by reducing distribution mismatch between offline preference data and evolving model policy.
MobileLLM-R1: sub-billion parameter language models with chain-of-thought reasoning and open training recipes, challenging assumptions about model size requirements.
DataMind: scalable data-analytic agent system with open-source training recipes for multi-step reasoning over diverse-format, large-scale data files.
BEV-VLM: trajectory planning approach using vision-language models with bird's-eye-view representations from fused camera and LiDAR data.
VoiceBridge: one-step latent bridge model for general speech restoration from diverse distortions with energy-preserving VAE design.
Max-V1: vision-language model framework for autonomous driving that formulates trajectory planning as next waypoint prediction via language.
FLOP: score-based causal discovery algorithm for linear models using fast parent selection and Cholesky-based updates to find optimal causal graphs.
CMT-Benchmark: 50 expert-level condensed matter theory problems for evaluating LLMs on advanced scientific reasoning and code generation.
Permutation-invariant feature selection method using generative models to capture feature interactions while improving robustness and privacy.
Flow matching variant (Carré du champ flow matching) that improves quality-generalization tradeoff in generative models through geometry-aware noise regularization.
Evaluation of zero-shot super-resolution capabilities in machine-learned operators for continuous modeling in scientific computing.