Show HN: Factagora – AI agents compete on predictions, time proves who's right
Platform where AI agents make predictions on factual claims and are scored by temporal accuracy rather than self-reporting, using a Temporal Knowledge Graph.
vLLM and NVIDIA achieved 38% higher throughput and a 13% latency improvement on GPT-OSS-120B on Blackwell GPUs via FlashInfer and torch.compile optimizations.
Open-source temporary email service built with Haraka and Node.js, supporting custom domains and multiple languages.
Tool for adding price tag overlays to product photos at scale as an alternative to Canva/Photoshop.
Video about Nvidia AI discovering mathematical patterns in reality.
Local AI agent with persistent memory using River Algorithm to store facts, timelines, and confidence levels without cloud dependency.
Founder replaced SaaS tools (Intercom, Pipedrive) with custom-built solution for student management to reduce costs.
AgentPass provides cryptographic identity and authentication infrastructure for autonomous AI agents. Identity layer enabling agent autonomy.
Relay product for efficient agent context management via ephemeral/durable classification. Token optimization for autonomous agents.
Standard for tracking human-AI creative control attribution in code generation. Developer tool for documentation and collaboration.
vLLM optimization achieving 26.2K prefill and 10.1K decode throughput on NVIDIA Blackwell for DeepSeek MoE models. Production LLM serving research.
Benchmark for evaluating LLM vision and tool-use on cursor control task based on Neuralink's Webgrid test. Measures multimodal agent capabilities.
Multi-agent autonomous workflow built a Game Boy Advance (GBA) emulator in 48 hours using TypeScript. Demonstrates agent orchestration and tool use.
DataClaw tool exports Claude Code conversation history to HuggingFace datasets. Data extraction tool with open source intent.
News headline about government official threatening to blacklist AI company over weapons. Policy news without technical depth.
Hobbyist built a block printing simulator using Claude Code. AI tool application but niche domain unrelated to core interests.
News article about Anthropic accusing Chinese AI firms of model distillation for IP theft. Industry news, not technical content.
Semantic parallelism optimization for efficient MoE model inference across multiple devices.
Diffusion-based recommendation system using continuous tokens with LLM integration.
Multi-modal framework combining LiDAR and camera for real-time 3D object detection in robots.
Machine learning model for traffic demand prediction with confidence intervals.
Evaluation of LLM accuracy for health advice across languages and contextual factors.
Multi-agent system using LLMs to detect vulnerabilities in hardware RTL design specifications.
Multimodal LLM combining vision, audio, and sensor data for embodied AI agents in smart homes.
Analysis of performance asymmetry in model-based reinforcement learning agents across different Atari tasks.
Machine learning approach to compile quantum circuits using diffusion models instead of traditional search algorithms.
Benchmark for evaluating multimodal LLM capabilities in humanities and social sciences requiring interdisciplinary reasoning.
Comprehensive benchmark for evaluating multimodal LLMs on front-end code generation from visual designs using modern frameworks.
Framework addressing dependency, asynchrony, and missing values in multivariate time series forecasting from real-world data.
Active view selection framework using neural uncertainty estimation for efficient 3D object reconstruction.
Machine learning interatomic potential models for predicting molecular geometries as an alternative to computational chemistry methods.
LLM-based framework for evaluating children's language function through phonetic transcription and automated speech assessment.
Theoretical analysis of multinomial logistic bandit problem with minimax-optimal algorithm design.
Characterization and comparison of State Space Models and hybrid architectures versus Transformers for long-context processing on edge devices.
Political language BERT model for analyzing political debates and discourse with domain-specific fine-tuning.
CNN-based approach for infrared small target detection and segmentation focusing on noise suppression.
Monte Carlo tree diffusion with multiple experts for protein sequence design combining masked diffusion models with tree search.
Framework using LLMs as spatio-temporal predictors with hierarchical temporal tokenization for human mobility and trajectory prediction.
Reinforcement learning fine-tuning approach using polychromic objectives to prevent policy collapse and maintain behavioral diversity.
Learnable dynamic routing mechanism for mixture-of-experts with LoRA adapters enabling efficient LLM task adaptation without fixed expert assignment.
Policy optimization method for text-to-image models addressing credit assignment instability in reinforcement learning fine-tuning.
Evaluation of Vision-Language-Action model robustness against multi-modal perturbations across 17 adversarial conditions.
LLM-based agent framework for recommendation systems that leverages commonsense reasoning to capture item relationships and user intent.
Deep learning approach for image transmission over noisy channels using semantic clustering via hash distillation.
Study on how LLM-generated rationales influence human plausibility judgments in commonsense reasoning tasks using 3,000 human and 13,600 LLM judgments.
Satellite-based chlorophyll-a monitoring for lagoon eutrophication using Sentinel-2 imagery and machine learning.
Latent-Augmented Discrete Diffusion model improving fast language generation by modeling cross-token dependencies.
Data generation framework for bimanual mobile robot manipulation using imitation learning without extensive manual demonstrations.
Method for scalable AI oversight via partitioned human supervision across multiple domain experts for complex multi-domain tasks.
Security evaluation framework for LLM backbone models in AI agents, addressing vulnerabilities unique to agent architectures.