Show HN: Buyout Game Benchmark: Multi-Agent Bargaining, Transfers, and Takeovers
Benchmark measuring multi-agent LLM bargaining, transfers, and financial incentives in a long-horizon social strategy game with eight models.
Railway CDN incident in which a configuration change accidentally enabled caching on disabled domains for 52 minutes.
Announcement of Apple beta OS releases for iOS, macOS, and other platforms.
Essay on AI agent loops: locking architecture, measuring against reality, and using AI for throughput rather than authority.
Benchmark tool using Blood on the Clocktower social deduction game to evaluate LLM reasoning, coordination, and deception abilities.
Vector quantization library implementing TurboQuant, PolarQuant, and QJL algorithms for compressing embeddings to 3-8 bits with unbiased inner products.
arXiv Labs metadata page with no article content.
Preprint on arXiv exploring mathematical methods and human thought in the context of AI and formalization for mathematics.
Git-based documentation system in Markdown with YAML frontmatter, version control, and web/CLI interfaces for ADRs, specs, and runbooks.
Open-source GitHub Action that evaluates pull requests for spam using multi-signal scoring to filter AI-generated content and SEO injection.
Minimal stub article about pitching ideas to voice agents.
Tool demonstrating that ChatGPT, Claude, Gemini, and Perplexity confidently provide incorrect SaaS product details without signaling uncertainty.
GitHub removes Copilot's ability to insert pull request ads after developer backlash over unsolicited product recommendations.
Sweet CLI: open-source cheaper alternative to Claude Code and Codex using open-source models for 5-10x higher usage at lower cost.
News brief on a judge halting the Nexstar/Tegna media merger despite Trump administration approval.
AgentHandover is a tool that observes user workflows on Mac and generates self-improving skill playbooks for AI agents to automate tasks.
Rebyte: cloud platform for running open-source AI agent skills with one click, including web scraping and data extraction.
Quote from Kelsey Hightower about junior engineer status regarding AI, minimal content provided.
h5i: security corpus tracking real-world incidents, attack vectors, and CVEs targeting autonomous AI agents; includes Git sidecar for recording agent decisions.
Critique of Weave, a tool for analyzing employee AI coding usage, that questions the lack of methodology transparency in its LLM-based evaluation metrics.
Threat modeling and authorization analysis for Model Context Protocol (MCP) systems.
Strategies for monetizing AI APIs in production environments, covering cost management and operational challenges.
Guide to moving GitHub private repos to Google Drive using cloud storage clients.
Datris.ai platform for AI-enhanced data pipeline management, accessible via natural language; supports AI agents through MCP.
PoliTax Split benchmark for evaluating PDF document splitting using presidential tax returns; tests LLM capabilities on complex document classification.
OpenScience.ink uses AI to summarize research papers from PubMed, simplifying dense scientific content with summaries and email delivery.
NewsMarvin aggregates AI news from 71 sources and classifies stories using Claude Haiku.
Create Context Graph is a tool for scaffolding AI agents with context graph memory.
Memoir is an open-source CLI tool that gives AI coding tools persistent memory across sessions via MCP.
Announcement of Tom Scott video about breaking a historical bell.
Developer replaced Firecrawl web scraping service with 2,700 lines of Elixir, including custom readability engine and bot protection evasion.
CLI tool enabling multi-agent debate between Claude, Codex, and Gemini on code and engineering questions with synthesis.
Kubernaut is an open-source AIOps platform that automates Kubernetes incident remediation using LLMs with live cluster access and kubectl commands.
Announcement of AI data center moratorium bill from Bernie Sanders and AOC.
Zeroback is an open-source realtime backend built on Cloudflare Durable Objects, providing database, APIs, and storage without vendor lock-in.
PDF document titled 'A Tinkerer's Introduction to Claude Code, by Claude Opus' but content failed to load.
Nteract 2.0 is a ground-up rebuild of a desktop notebook app for running Jupyter notebooks without a browser or server, with new runtime and architecture.
Article title mentions command injection vulnerability in OpenAI Codex, but content appears to be spam or broken page.
Staff engineer at Zopa Bank discusses handling sensitive data in LLM systems, including audit trails and data privacy considerations for production LLM deployments.
Personal blog post about automating note publishing from Obsidian to GitHub Pages.
Open-source CLI tool providing sandboxed LLM interactions with agentic coding loop capabilities similar to Claude Code.
Spring Test Profiler utility for visualizing Spring Test execution and optimizing context caching to improve build times.
MCP server for multi-instance Elasticsearch with per-instance memory and raw query execution, enabling LLM access to persistent learning.
Meta releases TRIBE v2, an AI model predicting brain responses to images, video, and language to advance neuroscience and inform AI development.
Claude Code plugin enabling integration with Codex for code reviews and task delegation within existing workflows.
Ask HN discussion about learning trade skills as career backup plan.
Open-source social media application implementing Paul Graham's intellectual CAPTCHA concept using math, logic, and community notes.
Analysis of AI SRE tools entering the market, covering vendor landscape including PagerDuty, Datadog, Microsoft, and startups building AI agents for incident management.
2006 discussion of code injection bugs and buffer overflows in software security. Not AI/tech news.
Open source tool to generate mdoc(7) man pages from CLI help output using LLMs, with agent skill support for AI coding agents.