A Survey of Data Agents: Emerging Paradigm or Overstated Hype?
Survey examining terminology, definitions, and taxonomy of data agents—autonomous systems orchestrating data and AI for complex data tasks.
LLM-based search agents trained on synthetic entity-centric data using improved reward mechanisms to capture informative near-miss samples.
Brain visual decoding from fMRI using hierarchical architecture for subject-agnostic reconstruction without subject-specific training.
OckBench: Benchmark measuring LLM reasoning efficiency via token usage, revealing up to 5x differences in token length across models.
Data-efficient fine-tuning strategy for adding controllable parameters to text-to-video diffusion models using synthetic data.
Refusal Steering: Inference-time method for fine-grained control over LLM refusal behavior on sensitive topics without retraining.
HiGR: Generative slate recommendation system using hierarchical planning and multi-objective preference alignment for ranked lists.
CogFlow: Multimodal LLM system for visual math problem solving improving visual perception integration and reasoning.
Comprehensive empirical study evaluating factors affecting safety alignment in LLMs and LRMs across 32 recent models.
Fast-ThinkAct: Efficient Vision-Language-Action framework reducing inference latency through verbalizable latent planning.
CLiMB: Domain-informed clustering framework for novelty detection in scientific discovery with application to galactic archaeology.
Molmo2: Open-weight vision-language model with video understanding, grounding, and disclosed training data and recipe.
MetamerGen: Latent diffusion model generating visual scenes aligned with human perception using periphery gist and foveal information.
Physics-encoded inverse modeling approach combining sequential architecture with physics constraints for Arctic snow depth prediction.
FROST: Attention-aware pruning method for efficient LLM reasoning by identifying and removing reasoning outliers while preserving capacity.
Persona Brainstorm Audit method for detecting bias and fairness issues in open-ended creative outputs from LLMs.
CryoLVM applies self-supervised learning with vision models to cryo-EM density maps for structural biology analysis.
Study showing SGD with sparsity outperforms Adam for RL from verifiable reward in LLM training, challenging standard optimization practices.
AceGRPO: Reinforcement learning agent for autonomous ML engineering using adaptive curriculum and group relative policy optimization to overcome parameter freezing.
Study on paraphrase generation and detection as mechanisms for language understanding and modeling in neural networks.
GOT-Edit improves object tracking in video by incorporating 3D geometric cues and semantic reasoning alongside 2D features.
SAS-Net addresses spatiotemporal misalignment in bidirectional photoacoustic microscopy using scene-appearance separation for medical imaging.
UI-Venus-1.5: GUI agent with 2B, 8B, and 30B-A3B variants for automating digital environment interactions with broad generality and strong task performance.
VESPO: Reinforcement learning method for training LLMs with improved stability through soft policy optimization and importance sampling to address policy divergence.
KBVQ-MoE: Compression technique for Mixture of Experts LLMs using vector quantization and SVD to reduce parameter size and memory for resource-constrained deployment.
PMG algorithm for humanoid locomotion using parameterized motion generation with trajectory-following control.
Sim2Radar uses VLM-guided scene reconstruction to synthesize radar training data from RGB images, addressing radar dataset scarcity.
Pawsterior framework extending variational flow matching for simulation-based inference with structured domains and physical constraints.
Identifies and analyzes 'silent inconsistency' problem in data-parallel LLM fine-tuning where gradient synchronization doesn't ensure worker-level optimization alignment.
ST-EVO framework for self-evolving LLM-powered multi-agent systems that dynamically construct task-adaptive communication topologies instead of predefined structures.
Studies the mechanical basis of capability emergence in neural networks at scales from 405K to 85M parameters, finding scale-invariant representation collapse and top-down reorganization.
Proposes AI-CARE metric incorporating carbon emissions and energy consumption alongside standard performance metrics for ML model evaluation.
Applies physics-constrained whole-pattern expectation-maximization algorithm for AI-driven refinement of X-ray diffraction crystal structures.
Analyzes quality issues in AI safety datasets, finding they rely on superficial 'triggering cues' rather than genuine adversarial patterns.
Randomized trial showing that AI-generated feedback suggestions from FeedbackWriter improve student revisions when reviewed by human TAs in an economics course.
Proposes symbolic alternative to GNN message-passing for more interpretable and expressive graph learning in high-stakes domains.
MASPO algorithm improves LLM reasoning through reinforcement learning with verifiable rewards, addressing gradient utilization and probability mass issues in existing RLVR methods.
ML method using random walk learning on cortical similarity networks for diagnosing Alzheimer's and Lewy body dementia.
User study comparing chatbots, games, and essays for persuasive learning on sustainability topics with identical content.
Uses LLM-assisted reasoning to map 2D engineering drawing annotations to 3D CAD features for manufacturing automation and process planning.
Benchmark study (SP-ABCBench) evaluating whether LLM agents can simulate human security and privacy attitudes and behaviors for risk forecasting.
Examines integration of AI into science education materials, covering personalization, adaptive instruction, and accessibility in K-12 learning contexts.
Proposes Reasoning Processing Unit (RPU) architecture to address memory bandwidth bottlenecks in LLM inference, particularly for reasoning applications with long outputs.
Phenomenological analysis of ML through Heideggerian philosophy. Theoretical framework rather than technical or practical contribution.
Neural spatiotemporal architecture for air quality forecasting in Delhi using four years of atmospheric data across sixteen spatial grids.
Study revealing performance degradation when using state-of-the-art text-to-image models as synthetic training data generators for vision tasks.
Benchmark suite for evaluating video reasoning capabilities in modern video models including spatiotemporal reasoning and scaling behavior.
Tensor network generator-enhanced optimization framework applying Born machines to the traveling salesman problem, a combinatorial optimization task.
MoBiQuant enables elastic LLM deployment with token-adaptive mixture-of-bits quantization supporting dynamic precision switching at runtime.
Federated learning framework for estimating a continuous-time Markov chain hazard model of bridge deterioration without sharing raw inspection data.