Security research on backdoor attacks in AI agent supply chains through poisoned interaction data collection, formalizing threat models for finetuned web browsing and tool-use agents.
Robotics research on learning from constrained demonstrations, where the teaching interface prevents experts from demonstrating optimal behavior, using techniques like kinesthetic teaching and sim-to-real transfer.
Game-theoretic model analyzing bias in meritocratic selection systems like admissions and hiring, examining how AI shapes perceived candidate value across socioeconomic groups.
Training-free diffusion model enabling layer-wise control in text-to-image generation through noise transplantation without fine-tuning or large datasets.
Deep Q-Network that learns satellite weights, combined with weighted least squares estimation, for CSI-free multi-satellite positioning in LEO constellations.
Joint audio-visual editing pipeline with video-to-audio generation model conditioning on source audio, target video, and text prompts for coherent edits.
Topological data analysis patch-based approach for CT imaging feature extraction improving ML model performance on medical diagnosis tasks.
Noise-aware masked autoencoder for self-supervised SAR satellite imagery representation learning addressing data scarcity and speckle noise challenges.
FusionRoute enables token-level collaboration between specialized and general-purpose LLMs via dynamic routing, improving efficiency and domain performance.
Language-aligned concept foundation model decomposing vision representations into human-interpretable concepts with spatial grounding across diverse tasks.
VisTIRA addresses vision-language model performance gap on visual math problems through structured tool integration and dense formula/layout handling.
Linear probes on LLM pre-generation activations predict task success before inference, enabling efficient routing and reducing extended reasoning compute costs.
Property-preserving kernel-based operator learning method for incompressible flow simulation respecting incompressibility and physical constraints.
Addresses cross-agent noise in multi-agent reinforcement learning through descent-guided policy gradients, reducing sample complexity from O(N/ε) to improved bounds.
Proposes Model Medicine framework for understanding, diagnosing, and treating disorders in AI models using clinical methodology analogies for interpretability.
Evaluates vision foundation models adapted for pasture biomass regression, comparing SSMs, transformers, and simpler cross-view modules on agricultural imagery.
Neural operator approach for modeling tau protein transport in Alzheimer's disease using mechanistic brain models and connectome data.
Survey introducing reinforcement learning methods to economists for solving high-dimensional dynamic programming problems that resist dimensionality reduction.
Framework defining AI models and AI systems through systematic review of 896 academic papers and 80+ regulatory documents to resolve the boundary problem in AI regulation.
Red-teaming study on adversarial activation steering techniques to compromise LLM responses, examining safety risks from semantic layer manipulation.
Evaluates autonomous cyber-attack capabilities of frontier AI models on multi-step attack scenarios, comparing seven models over 18 months at varying inference compute budgets.
Gradient flow utility metric for structural pruning and dynamic routing in deep networks, addressing magnitude bias in weight-based heuristics.
Foundation models as surrogates for active learning in materials discovery, reducing costly synthesis cycles through better uncertainty estimation.
Empirical comparison of CNNs, contrastive VLMs, and generative VLMs for crop disease classification across diverse conditions.
SHAMISA framework for no-reference image quality assessment using self-supervised learning on unlabeled distorted images.
LUMINA mammography benchmark dataset with 1824 multi-vendor FFDM images and energy/vendor metadata for medical imaging research.
Data augmentation method for ring-type polygon annotations in floorplan analysis preserving topology during geometric transformations.
Self-supervised learning framework using masked BRep autoencoder with hierarchical graph transformer for CAD model representation learning.
Research analyzing error sources in global feature effects (PD/ALE plots) for black-box model interpretation.
HindSight framework evaluates LLM-generated research ideas using time-split evaluation against future publications and citation impact.
GSD 2 is a standalone CLI coding agent built on Pi SDK, evolving from a Claude prompt framework to a full agent with session and context control.
MoltGuard is an open source runtime guardrails tool that blocks dangerous AI agent tool calls before execution, preventing credential leaks and database deletion. 16K+ downloads.
Platform where 40+ AI agents autonomously share ideas, review projects, provide feedback, and debate without human curation.
Discussion prompt about IDE and programming workflows post-AI.
Personal narrative about discovering security through game modding, connecting to modern reverse engineering and AI applications.
LiTo proposes a 3D latent representation jointly modeling object geometry and view-dependent appearance from RGB-depth images.
Security vulnerability scanner detecting patterns systematically introduced by AI code generation tools: SQL injection, hardcoded secrets, XSS, hallucinated packages.
Open source governance framework for AI agents in delivery pipelines, addressing gaps in CI/CD models for agent-driven software development.
Vision model hallucinated the entire contents of a grocery receipt rather than reading the actual receipt, demonstrating systematic fabrication risk in vision-language models.
WriterAgent (Cursor for LibreOffice) development progress adding MCP support, research sub-agents, voice interface, and evaluation dashboard for office document automation.
Open source tool for sharing context with AI agents (Claude, ChatGPT) via spatially-organized llms.txt files, working beyond context window limits.
Model Context Protocol (MCP) documentation tool providing local-first, indexed documentation packages for 100+ frameworks, with sub-10ms query latency for AI agents.
Usercall MCP tool enables AI agents to conduct user interviews via voice calls, returning structured insights with themes and quotes.
Distributed multi-agent cluster using local LLMs (DeepSeek-R1, Qwen) on single GPU server to reduce cloud API costs.
Nvidia announces space computing platforms for orbital data centers with AI acceleration for geospatial and autonomous operations.
Forum discussion about AI tool usage as job requirement.
Zalor feature for testing AI agents with custom CSV datasets and automated test case generation from edge cases.
AI agents applied to March Madness bracket predictions with performance metrics.
TypeScript wrapper simplifying MCP server connections from 30+ lines to 2 lines. Supports HTTP and stdio transports with auto-detection.
MIT-licensed digital twin of AWS providing a local replica that responds to real AWS API calls. Built entirely with AI, supports 147 services, designed for agent testing.