HN Greenpants 3/24/2026

AI Etiquette

Essay on establishing ethical guidelines and boundaries for AI tool usage in development and data handling.

HN jordan_gibbs 3/23/2026

Secret Hitler LLM Benchmark

LLM benchmark that uses 8-player games of Secret Hitler, played by multiple AI agents, to evaluate language models' deception and reasoning abilities.

HN HR01 3/23/2026

Why LLMs can't paragraph well

Analysis of why language models struggle with paragraph structure and coherence in writing. Examines technical aspects of LLM text generation limitations.

HN indigodaddy 3/23/2026

Welcome to the AI agent arms race

Report on the emerging AI agent race, with Anthropic, Nvidia, and Perplexity developing autonomous agents for business tasks. Discusses productivity gains and risks.

HN adamrezich 3/23/2026

Pony Gets a Template Engine

Pony language gains template engine for web development, supporting conditionals and loops with Mustache/Jinja-like syntax.

HN EvgeniyZh 3/23/2026

Vibe physics: The AI grad student

Harvard physics professor supervised Claude AI through a real quantum field theory research calculation end to end, without touching any files. Reports on capabilities and limitations.

HN jangletown 3/23/2026

Show HN: Evals Skills

LangWatch introduces ready-to-use eval skills and prompts to streamline LLM application onboarding, reducing setup time from hours to minutes without requiring manual instrumentation.

HN zone411 3/23/2026

Show HN: LLM Debate Benchmark

Benchmark measuring LLM performance in multi-turn adversarial debates across propositions, evaluating knowledge retention, factual accuracy, and argumentation under pressure.

HN py4 3/23/2026

The Priesthood of System Design

Essay arguing that coding agents will eventually handle system design, contrary to the common belief that system design is a uniquely human expertise.