Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry
Interpretability study of DINOv2 vision model using sparse autoencoders to analyze task-relevant concept representations.
Backdoor attack on vision-language-action models demonstrating vulnerability to behavioral hijacking via hidden training triggers.
Dynamic Gaussian Splatting improvement for monocular 4D scene reconstruction using uncertainty quantification.
Bayesian optimization method using LLM fine-tuning to perform Thompson sampling in large discrete spaces without gradient computation.
Offline-to-online reinforcement learning framework with adversarial fine-tuning for robust robot control under action perturbations.
Quaternion-valued Hopfield-structured neural network with supervised learning rules for continuous-time dynamical systems.
Training-free framework for improving vision-language models on information-dense images with text and graphical elements.
Study of continual pre-training for adapting LLMs to low-resource French dialects under tight compute and data constraints.
Investigation of in-context learning across transformer, state-space, and hybrid LLM architectures using behavioral and intervention methods.
Study of misconceptions novice programmers have about LLM-based coding assistants, examining impact of tool capabilities and extensions.
Training method combining supervised learning and reinforcement learning to improve multi-step reasoning in open-source LLMs.
Agentic multimodal model framework enabling tool invocation (code execution, web search) and reasoning integration for vision-language tasks.
Analysis of how LLMs shift moral judgments under persona role-play, introducing benchmark metrics for moral susceptibility and robustness.
Open benchmark for deep learning-based event reconstruction in neutrino telescope data using inverse problem solving.
Diffusion language model using Mamba backbone for efficient inference, achieving higher throughput than transformer-based alternatives.
Benchmark for evaluating embodied AI agents on interaction with physical interfaces (switches, panels, GUIs) in complex environments.
Novel sampling framework for flow matching models using importance weighting to reduce variance in expectation estimation.
Machine learning system for automating data quality monitoring and anomaly detection in particle physics collider experiments.
Study comparing CNN pretraining strategies (general vs domain-specific) for brain tumor classification in MRI images with limited data.
SocialNav foundation model for socially-aware embodied navigation with hierarchical architecture trained on 7M samples for human-compliant trajectory generation.
Heterogeneous multi-agent reinforcement learning with attention mechanism for automated feature transformation on structured data.
VCWorld biological world model for virtual cell simulation predicting cellular responses to perturbations using multi-source biological information.
QKAN-LSTM combining quantum-inspired Kolmogorov-Arnold networks with LSTM for improved sequential modeling with reduced parameter redundancy.
WisPaper end-to-end agent system for academic literature discovery and organization combining semantic search verification with workflow integration.
FRIEDA benchmark evaluating vision-language models on multi-step cartographic reasoning with map interpretation for disaster response and urban planning.
Rough sets methodology for explaining spectral graph clustering results on text documents, including handling of documents that lack clear content meaning.
Generalized Primal Averaging optimizer extending Nesterov's method for faster LLM training, unifying DiLoCo and schedule-free approaches with reduced memory requirements.
Trust region masking technique for LLM reinforcement learning addressing off-policy mismatch and approximation errors from implementation divergences in policy gradient optimization.
LIA supervised fine-tuning approach using LLMs for automatic software issue assignment in large open-source projects without heavy project-specific training data.
VISTA method for repairing incomplete maritime vessel trajectory data with repair provenance documentation for safety-critical applications.
CSyMR benchmark for compositional music information retrieval testing LLMs on multi-step reasoning over symbolic music scores and natural language queries.
GenAI-Net generative framework for automated design of chemical reaction networks implementing desired dynamical functions in synthetic biology applications.
DUET method for LLM unlearning via distillation from a contextualized teacher, removing undesirable knowledge without retraining while avoiding catastrophic forgetting.
LEC-KG framework combining LLM semantic understanding with knowledge graph embeddings for automated domain-specific knowledge graph construction from unstructured text.
Reinforcement learning approach for training humanoid whole-body controllers that generalize across diverse robot embodiments with varied dynamics and degrees of freedom.
Empirical study of 32K LLM agents on Chirper.ai social media platform analyzing collective behaviors, biases, and exclusionary dynamics across 7M posts.
Multimodal diffusion transformer for robotic bimanual dexterous manipulation integrating vision, proprioception, and tactile signals.
Study on how LLM-generated personalized messages in behavior-change systems affect user perceptions through exposure patterns rather than individual message quality.
Risk-sensitive evaluation framework for LLM hallucinations in medical advice, assessing clinical harm severity beyond factual correctness.
Automated black-box pipeline detecting unverbalized biases in LLM chain-of-thought reasoning without predefined categories using task-specific evaluation.
RooflineBench: Roofline model-based benchmarking framework for characterizing performance of Small Language Models on edge hardware.
Agentic system for respiratory disease diagnosis using multimodal sound generation and active adversarial curriculum learning.
Framework extending Item Response Theory to measure AI model propensities and behavioral tendencies beyond capability metrics.
Agentic system for hierarchical urban geospatial modification using multimodal models to handle dependency-aware city planning changes.
Heterogeneous graph transformer framework for predicting high-potential small and medium-sized enterprises using public business data.
Analysis showing that test-time training with KV binding can be expressed as a learned linear attention mechanism.
Federated learning aggregation method using gradient-based weighting to address client drift and data heterogeneity.
Safety filtering framework for flow-based generative models providing formal guarantees that generated samples satisfy hard constraints.
Training approach for collaborative AI agents using strategic risk aversion to improve generalization when paired with new partners.
Method for reconstructing video content from brain fMRI activity using hierarchical semantic guidance.