Hyperagents – Self-referential self-improving agents
Self-referential AI agents that optimize themselves for arbitrary computable tasks with experiment logging and code generation.
Shared task planner for household management that separates dated from undated tasks.
Chronos is a drag-and-drop task scheduler with UI-first design, built with AI assistance.
Local Windows application for natural language photo search without cloud uploads, processing terabytes on-device.
Vesper is an MCP-native tool enabling AI agents to autonomously search, download, clean, and export datasets from Kaggle, HuggingFace, and OpenML without human intervention.
IBM Bob is an AI development partner integrated into codebases with customizable modes for code assistance and quality improvements.
ToolTrust Scanner detects security vulnerabilities in MCP packages, including prompt injection and supply chain exploits, before AI agents execute them.
Pixelbeat is a Rust-based terminal music player daemon with pixel-art UI for coding sessions.
Discussion on giving MCP servers code execution capability to handle complex data processing tasks like analyzing high-frequency wearable sensor streams without manual aggregation.
Proposal for an open protocol called CRP for cognitive perception in AI systems with pattern recognition capabilities.
Personal experience using AI as a design engineering tool for experimentation and iterative refinement in creative work.
Framework for studying how LLM-based agents form stable stances and identities in multi-agent communities using virtual ethnography methods.
Bilevel autoresearch applying automated research loops to optimize autoresearch systems themselves, iteratively resolving their own bottlenecks.
Using LLMs to detect microservice architecture patterns from Infrastructure-as-Code artifacts for documentation purposes.
Analysis of 1.8M Hugging Face models tracking how multimodal capabilities emerge and propagate across open LLM families.
Systematic evaluation of four prompting strategies across GPT models on chart-based question answering, isolating prompt structure effects.
MERIT system combining memory and retrieval mechanisms with LLMs for interpretable knowledge tracing in educational settings.
Safety-constrained offline reinforcement learning using reachability analysis for sequential decision making in real-world applications.
TIPS framework for training search-augmented LLMs with reinforcement learning, improving credit assignment and reward shaping for question answering tasks.
Methods for generating high-quality synthetic training data using LLMs to fine-tune smaller models, analyzing diversity and distribution in embedding space.
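The embedding-space diversity analysis above can be illustrated with a minimal sketch. The function name and the choice of mean pairwise Euclidean distance as the diversity proxy are assumptions for illustration, not the paper's actual metric:

```python
import math

def mean_pairwise_distance(embeddings):
    # Diversity proxy for a synthetic dataset: the average Euclidean
    # distance between all pairs of embedding vectors. Higher values
    # suggest the generated samples are more spread out in embedding space.
    n = len(embeddings)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2
                              for a, b in zip(embeddings[i], embeddings[j])))
            total += d
            pairs += 1
    return total / pairs
```

In practice one would compare this score between synthetic and real data embedded by the same encoder; a large gap hints at mode collapse in the generator.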
Mechanistic interpretability study investigating whether LLMs develop genuine emotional representations or merely detect emotion keywords through circuit analysis.
Compact uncertainty estimation method for LLMs scoring cross-layer agreement patterns in internal representations via single forward pass.
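A cross-layer agreement score of the kind described above can be sketched as follows, assuming "agreement" means cosine similarity between consecutive layers' hidden states for a token, all collected in one forward pass; the function names and the consecutive-layer choice are illustrative assumptions:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def cross_layer_agreement(hidden_states):
    # hidden_states: one vector per layer for a single token.
    # Score = mean cosine similarity of consecutive layers;
    # low agreement would be read as high model uncertainty.
    sims = [cosine(hidden_states[i], hidden_states[i + 1])
            for i in range(len(hidden_states) - 1)]
    return sum(sims) / len(sims)
```

The appeal of such a score is cost: it reuses activations the forward pass already produced, with no sampling or fine-tuning.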
Sparse Feature Attention method reducing transformer self-attention cost via k-sparse feature representations instead of sequence-level sparsity.
Mathematical framework interpreting LLM hidden states as points on latent semantic manifolds with Riemannian geometry and Voronoi partitions.
Training-free hallucination detector for LLMs using sample transform cost to measure output distribution complexity without fine-tuning.
Chinese financial news dataset and benchmark for evaluating LLM-based agents in macro and sector asset allocation decision-making.
Conditional flow-matching framework using diffusion Transformers to unify learning of PDE solution operators across varying dimensionality.
CNN-LSTM framework with attention and focal loss for detecting falls in elderly individuals from multimodal sensor data.
AI-based tropical cyclone track and intensity forecasting with systematic bias correction on weather data.
Decision Transformer approach for optimizing emergency vehicle signal preemption using offline, return-conditioned sequence modeling.
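Return-conditioned sequence modeling hinges on the return-to-go: at each timestep the model is conditioned on the sum of future rewards. A minimal sketch of that preprocessing step (function names are illustrative; the actual signal-preemption state and action encodings are not specified here):

```python
def returns_to_go(rewards, gamma=1.0):
    # Return-to-go at step t: sum of (discounted) rewards from t onward.
    # This is the conditioning token in return-conditioned modeling.
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg

def build_sequence(rtg, states, actions):
    # Interleave (return-to-go, state, action) triples: the input
    # layout a Decision Transformer is typically trained on.
    return [tok for triple in zip(rtg, states, actions) for tok in triple]
```

At inference time one supplies a high target return as the first token, and the model autoregressively emits actions consistent with achieving it.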
Diffusion model for generating synchronized group dance choreography from music with spatial coordination for film/gaming.
Geometric Mixture-of-Experts framework for graph representation learning using curvature-guided routing on heterogeneous topologies.
Physics-informed Schrödinger bridge approach for data assimilation from sparse observations in PDE-governed systems.
Dataset aligning instruction manuals with assembly videos for evaluating multimodal LLMs on real-world technical tasks.
AEGIS infrastructure for governance of adaptive medical AI systems under FDA and EU regulations with continuous improvement.
Multi-task deep learning framework for predicting lithium-ion battery state-of-health and remaining useful life.
Delta-Aware Quantization framework for post-training LLM compression that preserves knowledge from alignment fine-tuning.
Classification approach for wind power ramp event forecasting under severe class imbalance for grid stability.
AgentSLR uses agentic AI to automate systematic literature reviews in epidemiology from retrieval through synthesis.
Method for adding trained persistent memory to frozen decoder-only LLMs without cross-attention mechanisms.
Applies conformal prediction for formal safety guarantees in wildfire spread prediction using tabular, spatial, and graph models.
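The conformal guarantee mentioned above can be sketched with split conformal prediction for regression: score absolute residuals on a held-out calibration set, take the finite-sample-corrected (1 - alpha) quantile, and use it as a symmetric interval half-width. Function names are illustrative; the wildfire models themselves are treated as a black box:

```python
import math

def conformal_halfwidth(cal_preds, cal_true, alpha=0.1):
    # Split conformal prediction: nonconformity scores are absolute
    # residuals on the calibration set; the (1 - alpha) empirical
    # quantile (with the (n + 1) finite-sample correction) becomes
    # the interval half-width for any future prediction.
    scores = sorted(abs(p - y) for p, y in zip(cal_preds, cal_true))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    return scores[min(rank, n) - 1]

def predict_with_interval(point_pred, q):
    # [pred - q, pred + q] covers the truth with probability
    # >= 1 - alpha, assuming exchangeability of calibration
    # and test points.
    return (point_pred - q, point_pred + q)
```

The guarantee is distribution-free and model-agnostic, which is what makes it attractive for wrapping tabular, spatial, and graph predictors alike.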
Comprehensive study of LLM-based data imputation across multiple models and datasets, analyzing hallucination effects and control mechanisms.
Combines graph signal processing with Mamba2 state-space models to create adaptive filter banks for language modeling.
Causal Direct Preference Optimization method for training LLMs to generate recommendations while mitigating spurious correlations.
Graph RAG framework combining labeled property graphs and RDF for retrieval-augmented generation over structured and semi-structured data.
T-MAP uses evolutionary search to red-team LLM agents by exploiting multi-step tool execution vulnerabilities in MCP ecosystems.
Analysis of feature importance bias in gradient boosting models under multicollinearity, affecting SHAP-based explanations.
WIST framework uses web-grounded iterative self-play with reinforcement learning to improve LLM reasoning in specific domains.
Review of neuroscience and language technologies for aphasia rehabilitation using personalized, culturally sensitive AI tools.
Theoretical analysis of full-waveform inversion using neural tangent kernel framework for geophysical and medical imaging.