Framework for automated skill acquisition in modular AI agents through mining open-source repositories to extract procedural knowledge and specialized expertise.
Framework for unlearning relational safety failures in multimodal LLMs where combinations of benign concepts become unsafe when linked by specific relations.
Gradient Atoms: unsupervised method for discovering and attributing model behaviors via sparse decomposition of training gradients without requiring predefined queries.
OpenHospital: interactive benchmark arena for evolving and evaluating LLM-based collective intelligence systems using physician and patient agents.
Theoretical analysis arguing that LLMs' most valuable capabilities are unexplainable components that cannot be captured by discrete rule systems.
SAGE framework: multi-agent reinforcement learning system for improving LLM reasoning without large human-labeled datasets, using self-play and closed-loop feedback.
Deep learning framework for handling geometrically distorted images in computer vision tasks using deformation-invariant neural networks.
LLAMAFUZZ uses LLMs to enhance greybox fuzzing for structured data, improving mutation strategies beyond random approaches.
ControlCity diffusion model generates urban morphology by fusing multimodal data and geographical context using semantic-aware synthesis.
TS-Reasoner agent integrates LLM reasoning with domain-specific code for multi-step time series inference and automated analysis tasks.
Physics-informed deep learning emulator for Earth system temperature modeling that reduces computational cost of climate simulations.
Data engine using scene graphs to systematically generate synthetic training data for improved compositional generalization in visual generation models.
Systematic literature review of LLM security benefits and drawbacks in code generation, vulnerability detection, and remediation tasks.
Interdisciplinary study on copyright law implications of training generative AI via web scraping, covering fair use and TDM exceptions.
Mathematical research exploring unifying formulas and theories connecting different representations of constants like π.
MASS method merges multiple fine-tuned models via adaptive subspace selection, improving accuracy over existing merging approaches without retraining.
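For context on the merging baseline that such methods improve upon, a minimal sketch of plain parameter averaging ("model soups") is shown below; this is the simplest merging baseline, not MASS's adaptive subspace selection, and the dict-of-lists parameter format is purely illustrative.

```python
def average_merge(models):
    """Merge fine-tuned checkpoints by elementwise parameter averaging.

    models: list of dicts mapping parameter name -> list of floats.
    Plain averaging requires no retraining; methods like MASS refine
    this idea by selecting which subspaces of each model to keep.
    """
    merged = {}
    for name in models[0]:
        # Group the same coordinate across all checkpoints, then average.
        cols = zip(*(m[name] for m in models))
        merged[name] = [sum(vals) / len(models) for vals in cols]
    return merged

m1 = {"w": [1.0, 2.0], "b": [0.0]}
m2 = {"w": [3.0, 4.0], "b": [1.0]}
merged = average_merge([m1, m2])
print(merged)  # {'w': [2.0, 3.0], 'b': [0.5]}
```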
Study on minor-embedding problem in quantum annealing for mapping Ising models to quantum processors and performance optimization.
Research showing that assigning diverse personas to generative AI reduces homogenization in collaborative creative outputs compared to single-persona systems.
NeuroSim V1.5 benchmarking software for analog computing-in-memory accelerators with non-ideality simulation for AI hardware efficiency.
CRBench: Real-world benchmark for text-to-chart retrieval using synthesized semantic insights.
FALCON: Method addressing false negatives in vision-language pretraining through contrastive learning.
Systematic literature review of explanation user interfaces for interpretable AI systems.
BiomedSQL: Text-to-SQL benchmark for scientific reasoning over biomedical databases requiring domain knowledge.
VERINA benchmark for evaluating LLM code generation with joint code, specification, and proof generation.
Robust aggregation method for distributed learning systems defending against Byzantine attacks.
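A standard illustration of Byzantine-robust aggregation (not necessarily this paper's specific method) is the coordinate-wise median, which bounds the influence of a minority of malicious workers; the worker gradients below are invented for the example.

```python
from statistics import median

def coordinate_wise_median(gradients):
    """Aggregate worker gradient vectors by the median of each coordinate.

    Unlike a plain mean, the per-coordinate median ignores extreme
    values, so a minority of Byzantine workers cannot pull the
    aggregated update arbitrarily far from the honest consensus.
    """
    dims = zip(*gradients)  # group values coordinate by coordinate
    return [median(vals) for vals in dims]

# Four honest workers roughly agree; one Byzantine worker sends
# a huge adversarial gradient that a mean would be wrecked by.
honest = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1], [1.1, 1.9]]
byzantine = [[1000.0, -1000.0]]
agg = coordinate_wise_median(honest + byzantine)
print(agg)  # [1.1, 1.9]
```

With five workers the median is an actual honest value in each coordinate, whereas the mean of the first coordinate would be pulled above 200.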
Differential privacy techniques for LLMs applied to radiology report classification tasks.
Benchmark evaluation of LLM effectiveness for text diacritization in Arabic and Yoruba with MultiDiac dataset.
Structured instruction approach to improve chart-to-code generation in multimodal LLMs with iterative refinement.
VideoITG: Frame selection method for efficient video understanding in video-LLMs using instructed temporal grounding.
Systematic study of LLM capabilities for discrete choice modeling with analysis of prompting strategies.
Method for detecting LLM confabulations using token-level uncertainty estimation for reliability in agentic applications.
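The general idea behind token-level uncertainty estimation can be sketched with per-token Shannon entropy; the exact estimator and threshold used by the paper are not specified here, so the threshold, variable names, and toy distributions below are all illustrative assumptions.

```python
import math

def token_entropy(probs):
    # Shannon entropy (in nats) of one token's predictive distribution.
    return -sum(p * math.log(p) for p in probs if p > 0)

def flag_confabulation(step_probs, threshold=1.0):
    """Flag a generation whose mean per-token entropy exceeds a threshold.

    step_probs: one probability distribution over the vocabulary per
    generated token. Consistently high entropy means the model was
    uncertain at many steps, a simple proxy for possible confabulation.
    """
    entropies = [token_entropy(p) for p in step_probs]
    mean_entropy = sum(entropies) / len(entropies)
    return mean_entropy > threshold, mean_entropy

# Confident generation: near one-hot distributions -> low entropy.
confident = [[0.97, 0.01, 0.01, 0.01]] * 3
# Uncertain generation: near-uniform distributions -> high entropy.
uncertain = [[0.25, 0.25, 0.25, 0.25]] * 3
```

On these toy inputs the uniform case has entropy ln(4) ≈ 1.39 per token and gets flagged, while the near-one-hot case does not; in an agentic pipeline such a flag could trigger retrieval or abstention instead of emitting an unsupported answer.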
Dynamic weighting approach integrating supervised fine-tuning and reinforcement learning for LLM post-training alignment.
Vision-based learning framework for omnidirectional bipedal locomotion using depth images on challenging terrain.
CardioComposer: Generative model for 3D cardiovascular anatomy synthesis using differentiable geometry and diffusion.
CodeGym: Reinforcement learning framework for training LLM agents in tool use that generalizes to new tasks and workflows.
LANCE: Low-rank compression technique for reducing activation memory in on-device continual learning.
ERGO: Two-stage coarse-to-fine pipeline for efficient high-resolution image processing in vision-language models.
Ontological framework for robots to explain competing plans through contrastive reasoning for human-robot interaction.
Co-denoising framework for transferring manipulation skills from human videos to robots by bridging morphological differences.
Exploration of neural network capacity to model aesthetic judgment and represent beauty across diverse object types.
Security research identifying backdoor vulnerabilities in AI agent supply chains through data poisoning across multiple pipeline stages.
Technique to prevent search-based linkage attacks on de-identified documents using perturbation methods.
Knowledge distillation method balancing exploration and guidance for training efficient small language models with reduced exposure bias.
Learning framework enabling robots to learn from constrained human demonstrations where demonstrators have fewer degrees of freedom.
Inference-time tree search guidance method for controllable graph generation using diffusion models across knowledge graphs and drug discovery.
Empirical study showing that frontier LLMs trained on copyrighted books produce literary text that evaluators prefer and that matches award-winning authors' styles.
Benchmark for open-vocabulary underwater instance segmentation with geometric and semantic alignment enhancements.
On-device object-goal navigation agent system using small LLMs with navigation map caching for zero-shot embodied AI.
Transformer-based model extending Prior-Data Fitted Networks for Bayesian clustering with uncertainty quantification on synthetic data.
Self-evolution framework for LLMs using ontology rules to reduce hallucinations in specialized domains like healthcare and law.