Beyond Noisy-TVs: Noise-Robust Exploration Via Learning Progress Monitoring
Proposes learning progress monitoring to improve exploration efficiency in reinforcement learning agents when encountering unlearnable noise sources.
Introduces attribution gradients technique to improve citation informativeness and evidence transparency in AI answer engines.
Forecasts expert selection patterns in Mixture of Experts LLMs to reduce data movement overhead in multi-unit serving systems.
Extends Forward-Forward algorithm to reinforcement learning using action-conditioned Q-functions and layer activity statistics as learning signals.
CQA-Eval evaluation framework for multi-paragraph clinical question answering systems with physician annotations and recommendations for resource-constrained settings.
f-INE hypothesis testing framework estimates sample influence on model performance while accounting for training randomness, addressing instability in existing influence estimation methods.
MusicRFM framework adapts Recursive Feature Machines to enable fine-grained control over frozen pre-trained music generation models via internal activation steering.
Deep learning approach fixing systematic S-wave detection failures in seismic phase picking via shape-aware loss functions.
SAGA framework for source attribution of AI-generated videos. Identifies specific generative model used instead of binary real/fake detection.
Research on contrastive fusion for higher-order multimodal alignment in joint representation learning across multiple modalities.
Deep learning approach using YOLO and ResNet50 for breast cancer detection in mammograms with improved out-of-domain robustness.
IMAgent: open-source visual agent trained with end-to-end RL for multi-image reasoning tasks, addressing limitations of single-image VLM agents.
Method for dense 3D point tracking and reconstruction in dynamic scenes using single forward pass without requiring known camera poses.
Maps EU AI Act legal requirements to technical verification activities for compliance assessment of high-risk AI systems across member states.
FedVideoMAE: federated learning framework for privacy-preserving video moderation using self-supervised representations and differential privacy.
Open-source image generation model with improved reasoning for logic-intensive instruction following, closing gap to closed-source systems.
Multi-agent framework automating full computational catalysis research lifecycle from conception to publication.
Equilibrium propagation method for optimizing compound AI systems with multiple modules in long-horizon agentic workflows.
Framework using influence functions to craft training data perturbations inducing targeted model behavior changes.
Research on uncertainty quantification for ML interatomic potentials using evidential deep learning.
arXiv: Geometric analysis of transformer optimization dynamics revealing low-dimensional manifolds in grokking.
Research paper studying loss-landscape geometry as early-warning signals for grokking in neural networks.
CeRA: parameter-efficient fine-tuning method overcoming LoRA's linear capacity ceiling via non-linear gating and dropout for rank adaptation.
SafeSci: comprehensive benchmark and framework for evaluating LLM safety in scientific domains with multi-domain risk coverage and objective evaluation.
Framework for EEG-to-text decoding addressing semantic bias and signal neglect in neural signal interpretation. Published on arXiv.
Stock market prediction using Node Transformer architecture with BERT sentiment analysis to capture market patterns and dependencies.
DiFlowDubber: discrete flow matching framework for video dubbing with TTS, lip synchronization, and expressive prosody. Published on arXiv.
Qualitative study of 167,000+ AI agents on multiple platforms learning from each other and developing emergent behaviors without researcher intervention.
arXiv: RAG-enhanced diffusion models using adaptive guidance to resolve conflicts between retrieved noisy context and parametric model knowledge.
Uses unsupervised machine learning (UMAP, HDBSCAN) to analyze drift rate patterns in fast radio burst data, discovering bimodal structure in emission regions.
Studies robustness of medical vision-language models under real clinical workflows using chain-of-distribution attacks and token-space repair techniques.
ArXiv research on parameterized GELU activation for controlled ReLU approximation in deep networks.
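The entry above concerns a parameterized GELU that can be tuned toward ReLU. The paper's exact parameterization is not given here; a minimal sketch of one plausible form, gating with Phi(alpha * x) so that large alpha sharpens the gate into a step (recovering ReLU in the limit), is:

```python
import math

def parameterized_gelu(x: float, alpha: float = 1.0) -> float:
    """Illustrative parameterized GELU: x * Phi(alpha * x).

    alpha = 1 recovers standard GELU; as alpha grows, the Gaussian
    CDF gate approaches a hard step and the activation approaches
    ReLU. This is a generic sketch, not the paper's formulation.
    """
    return x * 0.5 * (1.0 + math.erf(alpha * x / math.sqrt(2.0)))

# Standard GELU at x = 1 is about 0.8413; with alpha = 1000 the
# function is numerically indistinguishable from ReLU.
```

A single scalar `alpha` per layer (or per network) gives a controlled interpolation between the smooth and hard regimes.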
ArXiv paper on coarse-to-fine visual processing for efficient document parsing with vision-language models.
ArXiv study on behavioral consistency of LLM agents in SWE-bench comparing multiple models.
ArXiv research analyzing prompt injection attack success stages across five frontier LLM agents.
ArXiv paper on token-level entropy regulation for reinforcement learning in large reasoning models.
ArXiv research on spectral edge thesis controlling phase transitions in neural network training dynamics.
APEX-EM non-parametric framework for LLM agents to accumulate and reuse procedural plans without weight modification.
World model planning for structured origami generation satisfying geometric constraints and kinematic rules via long-horizon reasoning.
Terminal agents executing enterprise tasks via CLI are simpler and more cost-effective than tool-augmented or web agents.
Transfer learning methods for nonparametric Bayesian networks under scarce data with constraint-based and score-based algorithms.
Body model ablation replaces SMPL with Momentum Human Rig for 3D Gaussian avatar generation with simpler architecture.
ProdCodeBench evaluates AI coding agents using production-derived tasks reflecting real developer-agent sessions and workflows.
Visual attention inertia in MLLMs causes cognitive hallucinations; proposes mitigation for compositional understanding.
Convolutional surrogate model for accelerating 3D discrete fracture-matrix simulations in groundwater flow modeling.
LiME achieves expert specialization in multimodal MoE-PEFT via lightweight modulation instead of separate adapters per expert.
SIEVE enables sample-efficient parametric learning from natural language instructions and feedback without high-quality traces.
Model scheduling for masked diffusion language models uses smaller models at early denoising steps for faster generation.
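The scheduling idea above, cheaper models for the early, coarse denoising steps and larger models only where fine detail is resolved, can be sketched generically. The model names and step fractions below are illustrative assumptions, not taken from the paper:

```python
def schedule_models(num_steps, models_with_fracs):
    """Assign a model to each denoising step.

    models_with_fracs: list of (model_name, fraction_of_steps) in run
    order; early (noisier) steps get the earlier, smaller models.
    Generic sketch of the scheduling idea, not the paper's policy.
    """
    schedule = []
    for name, frac in models_with_fracs:
        schedule.extend([name] * round(num_steps * frac))
    # Pad or trim with the final model to absorb rounding drift.
    last = models_with_fracs[-1][0]
    return (schedule + [last] * num_steps)[:num_steps]

plan = schedule_models(10, [("small", 0.5), ("large", 0.5)])
# First half of the steps run the small model, the rest the large one.
```

The total cost then scales with the fraction of steps given to each model size rather than with the largest model alone.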
Process reward models improve LLM mathematical reasoning by providing step-level feedback on intermediate errors, not just final outcomes.
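The step-level feedback idea above can be illustrated with a toy rule-based checker standing in for a learned process reward model: each intermediate step is scored on its own, so an error mid-derivation is localized rather than only penalized at the final answer. The step format and scoring rule here are assumptions for illustration:

```python
def step_rewards(steps):
    """Score each derivation step, not just the final outcome.

    Each step is a string like '2+3=5'; reward 1.0 if the equality
    holds, else 0.0. A real process reward model is learned, not a
    rule; this is only a toy stand-in for the idea.
    """
    rewards = []
    for step in steps:
        try:
            lhs, rhs = step.split("=")
            ok = eval(lhs) == float(rhs)  # toy verifier for arithmetic
        except Exception:
            ok = False
        rewards.append(1.0 if ok else 0.0)
    return rewards

# A correct first step followed by a wrong second step yields
# [1.0, 0.0], pinpointing where the reasoning went off track.
```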
Fairness-aware GNN training using contrastive learning and counterfactual augmentation to mitigate biases from graph structure.