Isolater - Feed

HN megabless123 3/17/2026

Agentic Representation of Ecosystems

Conceptual project exploring AI agents representing ecosystem interests and legal rights, combining agent design with environmental protection frameworks.

HN 7777777phil 3/17/2026

Grammarly Is Facing a Class Action Lawsuit over Its AI 'Expert Review' Feature

Legal news about Grammarly lawsuit over AI feature claims. Not technical or development-focused.

HN igor_ryabenkiy 3/17/2026

Morgan Stanley projects nearly $3T in AI infrastructure investment by 2028

Market projection of $3T AI infrastructure investment by 2028. Financial forecast, minimal technical relevance.

HN gauravvij137 3/17/2026

Show HN: FC-Eval – CLI to Benchmark Local or Cloud LLMs on Function Calling

CLI benchmark for evaluating LLM function calling across 30 test cases. Supports cloud and local models for agent workflow testing.

HN yubainu 3/17/2026

Show HN: Detecting LLM hallucinations in <1ms using hidden states (RTX3050, 4GB)

Open-source tool detecting LLM hallucinations via hidden state analysis. Achieves 0.90+ ROC-AUC on Gemma/Llama with <1ms latency.

HN michael__li 3/17/2026

2X-3X productivity from parallel agents

Technique for running multiple AI coding agents in parallel using git worktrees, with a claimed 2-3X productivity improvement over sequential execution.

HN dorbanianas 3/17/2026

Show HN: Flock v0.7.0 – Semantic Layer for DuckDB (C++)

Flock v0.7.0: Open-source DuckDB extension enabling LLM operators and RAG pipelines natively in SQL. Adds Anthropic/multi-provider support.

HN mengram-ai 3/17/2026

Show HN: Mengram – Open-source memory layer for AI agents

Open-source memory layer for AI agents. Title only, lacks technical implementation details.

HN dirk94018 3/17/2026

Unix Reimagined for AI – Iterative Coding in a Shell Loop

Shell-based iterative coding approach for AI. Title only, insufficient detail provided.

HN marcostaira 3/17/2026

Capyra – open-source agent runtime for SAP B1 and WhatsApp

Open-source autonomous agent runtime connecting AI to business systems (ERP, databases) via WhatsApp, Slack, Telegram with action capabilities.

HN warmcat 3/17/2026

Ask HN: With Promptfoo acquired by OpenAI, what are MCP devs using for testing?

Discussion of testing tools for MCP servers after the Promptfoo acquisition. Mentions MCPSpec, a project for CI testing of Model Context Protocol.

HN Ricky_Tsou 3/17/2026

Show HN: MUP – Interactive UI inside LLM chat, so anyone can use agentic AI

MUP (Model UI Protocol) enables interactive UI components in LLM chat, allowing both users and agents to trigger functions. Includes PoC host and 9 example implementations.

HN terranvigil 3/17/2026

VEO – Open-source content-adaptive video encoding optimizer in Go

VEO is an open-source video encoding optimizer that uses VMAF quality measurement and convex hull analysis for content-adaptive bitrate decisions.

HN kristianpaul 3/17/2026

The first open-source agentic AI physicist

Open-source AI agent designed to perform physics research tasks autonomously.

HN otti-sister 3/17/2026

Show HN: Oh-my-agent – A structural harness for AI agents in real projects

Framework for reliable AI agent development addressing hallucination and task drift. Structured protocol for production agent deployments.

HN isaacrolandson 3/17/2026

Show HN: MCP Isn't Dead. You're Just Using It Wrong

Analysis of MCP dynamic tool registration feature. Argues MCP enables advanced agent capabilities beyond static tool definitions.

HN gneray 3/17/2026

Ask HN: What AI can you use for personal video editing?

User question about AI tools for personal video editing. Discussion of limitations in current LLM video capabilities.

HN megacorp 3/17/2026

OpenAI to Cut Back on Side Projects in Push to 'Nail' Core Business

News headline on OpenAI's decision to cut side projects. Company strategy update.

HN calmkeepai 3/17/2026

LLM drift-Claude vs. Calmkeep: 25-turn Code (60% vs. 85%) & Legal (50% vs. 100%)

Performance comparison of Claude vs. Calmkeep on 25-turn code and legal tasks. Reports 60% vs. 85% on code and 50% vs. 100% on legal, in that order.

HN mattbowen 3/17/2026

Three zones of LLM competence for software engineers

Analysis of LLM competence zones for software engineering tasks. Framework for understanding model capabilities and limitations.

HN vmaurin 3/17/2026

LLMs benchmark with esoteric programming languages

Benchmark study showing LLM code generation relies on memorization. Models score 90% on Python but 3.8% on esoteric languages.

HN _mrinalwadhwa_ 3/17/2026

Show HN: FreeFlow – Open-Source Wispr Flow

Open-source voice-to-text tool with real-time speech cleaning and injection into any app. Customizable alternative to Wispr Flow.

HN lumieremedia 3/17/2026

ClickSay – Chrome extension that captures UI context for AI coding tools

ClickSay is a Chrome extension that captures UI context (selectors, styles, HTML, screenshots) and voice input for AI coding tools like Claude Code.

HN danieltk76 3/17/2026

Show HN: The new security frontier for LLMs; SIEM evasion

Security research showing AI agents can perform SIEM/EDR evasion, indicating organizations must assume adversaries will gain these LLM-powered capabilities.

HN ingve 3/17/2026

How I Used Lima for an AI Coding Agent Sandbox

Experience report using Lima for sandboxing AI coding agents (Claude Code, Codex) to enable autonomous operation with controlled permissions.

BL 3/17/2026

OpenAI Japan announces Japan Teen Safety Blueprint to put teen safety first

OpenAI Japan announces safety framework for teen use of generative AI.

BL 3/17/2026

Introducing GPT-5.4 mini and nano

OpenAI releases GPT-5.4 mini and nano models optimized for coding and subagents with 2x faster inference and improved reasoning.

HN armcat 3/17/2026

Rtk – CLI proxy that reduces LLM token consumption by 60-90%

Rtk is a Rust CLI proxy that reduces LLM token consumption by 60-90% by filtering and compressing command outputs before they reach the model's context, with <10ms overhead.

HN d--b 3/17/2026

Ask HN: Quick technical questions about LLMs

Discussion thread with technical questions about LLM mechanics: token stopping, prompt continuation, and next-token prediction behavior.

HN runningmike 3/17/2026

Can LLMs Hack Enterprise Networks?

Prototype using LLMs for autonomous assumed-breach penetration testing against Active Directory networks, demonstrating LLM capabilities in enterprise security contexts.

HN elly-99 3/17/2026

A structural epistemic limit in LLMs: 8–15% unverifiable claims across domains

MarCognity-AI is an open-source framework analyzing LLM claim verification, finding 8-15% unverifiable claims. Decomposes responses and verifies against sources.

HN Anon84 3/17/2026

Out-of-Context Reasoning in LLMs: A short primer and reading list

Primer on out-of-context reasoning in LLMs: when models reach conclusions requiring reasoning not present in context window, affecting generalization and alignment.

HN leonickson 3/17/2026

Show HN: ModelSweep - Open-Source Benchmarking for Local LLMs

ModelSweep is a GUI-based benchmarking workbench for evaluating local LLMs on Ollama, enabling test suite building and comparative dashboards.

HN kesiees 3/17/2026

Llmgate – call any LLM via YAML config, 2 dependencies

Llmgate is a lightweight Python wrapper supporting 21 LLM providers via YAML config with only 2 dependencies (httpx, pyyaml).

HN ddtaylor 3/17/2026

Ad-homineLLM: Are you wrong because you used an LLM?

Philosophical critique examining ad hominem fallacies applied to LLM outputs and source credibility in argument evaluation.

HN Junnn 3/17/2026

Show HN: DataFlow,Turn raw data into high-quality LLM training datasets

DataFlow is a low-code visual pipeline tool for generating, cleaning, and preparing high-quality LLM training datasets with flexible orchestration.

Ax Mayank Mishra, Shawn Tan, Ion Stoica, Joseph Gonzalez, Tri Dao 3/17/2026

M$^2$RNN: Non-Linear RNNs with Matrix-Valued States for Scalable Language Modeling

M²RNN: non-linear RNN architecture with matrix-valued states for language modeling with greater expressive power than Transformers.

Ax Peng Xu, Zhengnan Deng, Jiayan Deng, Zonghua Gu, Shaohua Wan 3/17/2026

AerialVLA: A Vision-Language-Action Model for UAV Navigation via Minimalist End-to-End Control

AerialVLA: end-to-end vision-language-action model for UAV navigation combining visual interpretation with fuzzy linguistic instructions.

Ax Xiangyu Li, Huaizhi Tang, Xin Ding, Weijun Wang, Ting Cao, Yunxin Liu 3/17/2026

OxyGen: Unified KV Cache Management for Vision-Language-Action Models under Multi-Task Parallelism

OxyGen system for unified KV cache management in vision-language-action models enabling efficient multi-task parallel inference.

Ax Xiangbo Gao, Mingyang Wu, Siyuan Yang, Jiongze Yu, Pardis Taghavi, Fangzhou Lin, Zhengzhong Tu 3/17/2026

The Pulse of Motion: Measuring Physical Frame Rate from Visual Dynamics

Analysis of temporal consistency in generative video models showing lack of reliable physical frame rate grounding.

Ax Parth Patne, Mahdi Taheri, Ali Mahani, Maksim Jenihhin, Reza Mahani, Christian Herglotz 3/17/2026

SPARQ: Spiking Early-Exit Neural Networks for Energy-Efficient Edge AI

SPARQ framework integrating spiking neural networks, quantization, and early-exit mechanisms for energy-efficient edge AI.

Ax Xiaoliang Fu, Jiaye Lin, Yangyi Fang, Chaowen Hu, Cong Qin, Zekai Shao, Binbin Zheng, Lu Pan, Ke Zeng 3/17/2026

From $\boldsymbol{\log\pi}$ to $\boldsymbol{\pi}$: Taming Divergence in Soft Clipping via Bilateral Decoupled Decay of Probability Gradient Weight

Bilateral decoupled decay method for stabilizing soft clipping in reinforcement learning with verifiable rewards for LLM reasoning.

Ax Andrew Katz 3/17/2026

Extending Minimal Pairs with Ordinal Surprisal Curves and Entropy Across Applied Domains

Extension of minimal pairs evaluation using ordinal surprisal curves to assess linguistic knowledge in LLMs beyond binary judgments.

Ax Wonbin Lee, Dongki Kim, Sung Ju Hwang 3/17/2026

ES-Merging: Biological MLLM Merging via Embedding Space Signals

Method for merging specialized biological multimodal LLMs using embedding space signals to combine modalities.

Ax Mritula Chandrasekaran, Sanket Kachole, Jarek Francik, Dimitrios Makris 3/17/2026

PGcGAN: Pathological Gait-Conditioned GAN for Human Gait Synthesis

GAN framework for synthesizing pathological gait sequences from 3D pose data for clinical analysis.

Ax Max Hellrigel-Holderbaum, Edward James Young 3/17/2026

Questionnaire Responses Do not Capture the Safety of AI Agents

Study showing questionnaire-based safety assessments of AI agents fail to capture real-world deployment safety concerns.

Ax Wen Yan, Yipei Wang, Shiqi Huang, Natasha Thorley, Mark Emberton, Vasilis Stavrinides, Yipeng Hu, Dean Barratt 3/17/2026

Deep EM with Hierarchical Latent Label Modelling for Multi-Site Prostate Lesion Segmentation

Hierarchical EM framework for prostate lesion segmentation handling label variability across multi-site clinical datasets.

Ax Yuantong Li, Lei Yuan, Zhihao Zheng, Weimiao Wu, Songbin Liu, Jeong Min Lee, Ali Selman Aydin, Shaofeng Deng, Junbo Chen, Xinyi Zhang, Hongjing Xia, Sam Fieldman, Matthew Kosko, Wei Fu, Du Zhang, Peiyu Yang, Albert Jin Chung, Xianlei Qiu, Miao Yu, Zhongwei Teng, Hao Chen, Sunny Baek, Hui Tang, Yang Lv, Renze Wang, Qifan Wang, Zhan Li, Tiantian Xu, Peng Wu, Ji Liu 3/17/2026

MBD: A Model-Based Debiasing Framework Across User, Content, and Model Dimensions

Framework for debiasing recommendation system value models across user, content, and model dimensions.

Ax Auksarapak Kietkajornrit, Jad Tarifi, Nima Asgharbeygi 3/17/2026

Distilling Reasoning Without Knowledge: A Framework for Reliable LLMs

Modular framework separating planning from retrieval in LLMs to improve reliability on factual QA with explicit tool usage.