RieMind: Geometry-Grounded Spatial Agent for Scene Understanding
Spatial reasoning agent decoupling perception from reasoning in visual language models for improved metric and geometric scene understanding.
Bi-level optimization approach using Stackelberg game theory for coupled morphology-control co-design in embodied agents.
Adversarial patch framework for evasion and impersonation attacks against facial re-identification systems across non-overlapping cameras.
Safety defense mechanism for LLMs monitoring intermediate reasoning steps in chain-of-thought to prevent jailbreak attacks.
Benchmark evaluating marginal utility of agent skills for LLM-based software engineering agents on real GitHub issues and requirements.
Empirical study of 16 LLMs examining internal mechanisms for table understanding across attention dynamics, layer depth, and expert activation.
Video object detection method using adaptive residual context for identifying autonomous shuttles in urban traffic monitoring.
Comprehensive safety evaluation and monitoring framework for LLM-based multi-agent systems addressing novel risks beyond single agents.
Framework for improving robustness of quantized DNNs through three-stage fine-tuning addressing both fault and attack resilience.
Analysis of safety vulnerabilities in test-time training methods for LLMs, examining susceptibility to prompt injection and adversarial attacks.
Vision-language critic model leveraging pre-trained VLAs for multi-agent reinforcement learning value estimation with improved generalization.
Memory management framework for small language model agents using adaptive clustering to organize experiences and prevent knowledge corruption.
Fine-tuning strategies for PDE foundation models using physics-informed training to adapt to new tasks with limited domain-specific data.
Comparative study of classical ML and deep learning for music genre classification, focusing on underrepresented Nepali music traditions.
Framework evaluating AI agent vulnerabilities by applying malware analysis concepts to test-time agent behavior and adversarial robustness.
Benchmark challenge for robotic collaborative manipulation and assembly tasks in industrial automation settings.
Knowledge distillation method for tabular models that addresses feature interactions without original training data, enabling privacy-preserving model compression.
RSGen: Framework for layout-driven remote sensing image generation using diffusion models with edge guidance.
Multi-agentic workflow deploying AI agents with automated instruments to recover critical materials via selective precipitation.
Analyzes grokking phenomenon in neural networks through spectral gating mechanism and optimizer noise interaction.
Argues for taxonomy-specific evaluation in time-series forecasting to accurately assess ML progress versus classical methods.
SlovKE: Dataset and LLM evaluation for keyphrase extraction in Slovak, a morphologically rich low-resource language.
Proposes error estimation method for physics-informed neural networks using finite-difference post-hoc validation.
DOT: Automated database tuning system using dynamic knob selection and online sampling to optimize DBMS performance.
InterveneBench: Benchmark evaluating LLMs on causal inference and intervention reasoning in realistic social science scenarios.
Studies how LLMs model student misconceptions when generating multiple-choice distractors, analyzing reasoning strategies.
PokeAgent Challenge: Large-scale benchmark for competitive multi-agent decision-making with partial observability and long-horizon planning.
Lore: Protocol using structured Git commit messages to preserve decision context and institutional knowledge for AI coding agents.
Physics-informed neural networks and neural operators for simulating EUV electromagnetic wave diffraction in lithography.
PRIMO R1: Framework using reinforcement learning to improve multimodal models for process reasoning in robotic manipulation.
Analyzes moral indifference in LLMs due to compressed moral concepts and proposes remedial techniques.
Mixture-of-Depths Attention: Mechanism addressing signal degradation in deep LLMs by enabling attention to multiple depth levels.
Combines tree-search, generative models, and Nash bargaining for opponent modeling in game-theoretic reinforcement learning.
Review of deep learning methods for photoplethysmography signal analysis in clinical and wearable applications.
FAIRGAME: Framework using game theory to detect and recognize bias in multi-agent AI systems.
Method to reduce reasoning path length in large reasoning models like o1 and R1 using reward designs in reinforcement learning.
AssetOpsBench: A benchmark framework for evaluating LLM agents on industrial asset operations tasks like condition monitoring and maintenance scheduling.
Framework for AI alignment grounded in resource-rational contractualism, enabling diverse stakeholders to reach agreements on AI decision-making.
Machine learning approach to automate story point estimation for software sprint planning using comparative learning from historical team decisions.
Vision-language model for chart reasoning using chain-of-thought supervision and reinforcement learning to improve numerical comprehension and multi-level visual understanding.
Dynamic retrieval-augmented generation system for visual question answering that retrieves from both text and images to handle complex multimodal queries.
Multimodal reinforcement learning approach for chart-to-code generation that combines structured output requirements with visual reasoning on information-rich images.
Community paper from NSF workshop on AI's role in mathematical and physical sciences, discussing opportunities across astronomy, chemistry, materials, math, and physics.
Data augmentation framework for vision-language-action models in robot manipulation using generative visual transfer to reduce annotation costs.
Framework using LLMs as zero-shot reasoning engines to automate hyperparameter configuration for metaheuristic algorithms without training.
Method to detect and mitigate unproductive reasoning in large reasoning models by identifying early signals predicting capability boundary violations.
Agentic framework for automated scientific discovery that iteratively explores unknown systems through experiments and analysis without domain-specific tailoring.
Framework for learning abstract world models that jointly represent symbolic states and causal processes for endogenous and exogenous dynamics in robot planning.
Framework converting internet videos of human computer use into training data for computer-using AI agents via UI trajectory extraction.
Geospatial foundation models for Earth observation using POI-guided contrastive learning to improve human-centered representations.