Isolater - Feed

HN curtisblaine 2/20/2026

Ask HN: What Is the Point of WebMCP?

HN discussion questioning WebMCP use cases and value proposition for AI applications.

HN AmrDabb 2/20/2026

AI Desktop Agent over VNC – your AI connects to your desktop like a remote user

Clawd Cursor: AI desktop agent over VNC with a REST API, using a hybrid approach: an Action Router for common tasks and an LLM fallback for complex operations.

HN taubek 2/20/2026

GPT 5.3 Codex wiped my F: drive with a single character escaping bug

Reddit post about alleged data loss from GPT Codex due to command escaping bug.

HN happymouse 2/20/2026

CSS Agent Garden – AI agents style one HTML page via MCP

AI agents style a single HTML page via MCP protocol, exploring CSS generation capabilities and quirks.

HN nonekme 2/20/2026

Show HN: Global Issue Memory MCP – Stack Overflow for Your Coding Assistant

Open-source MCP server enabling AI coding assistants to look up and share error solutions across projects.

HN daniel_ward 2/20/2026

How did I revolutionized my productivity using OpenClaw

Productivity claim about OpenClaw with minimal details provided.

HN rkrizanovskis 2/20/2026

We built a desktop AI agent that runs commands locally

Desktop Commander executes tasks locally via natural language: file operations, code generation, deployment automation.

HN jaynamburi 2/20/2026

GPU Rack Power Density, 2015–2025

Analysis of GPU power density trends from 2015-2025 and thermal management challenges with latest Blackwell chips.

HN gauravvij137 2/20/2026

Show HN: CLI tool to analyze your Vector Embeddings!

CLI tool for auditing embedding spaces, built by NEO ML agent. Detects semantic inconsistencies and generates visualizations.

HN jdboyd 2/20/2026

N64 Game-Engine and Editor using libdragon and tiny3d

Game engine and editor for N64 using libdragon and tiny3d libraries without proprietary SDKs.

HN arjunbajaj 2/20/2026

Show HN: Fostrom, an IoT Cloud Platform built for developers

IoT cloud platform with device SDKs and programmable actions for fleet management.

HN matt_d 2/20/2026

Proof Assistants in the Age of AI

Formal proof assistants increasingly matter as AI generates verified mathematics, serving as a collaboration environment for humans and AI.

HN riyogarta 2/20/2026

Show HN: Syne – AI agent that remembers everything, built on PostgreSQL

Self-hosted AI agent framework with persistent semantic memory in PostgreSQL, anti-hallucination, and runtime ability creation.

HN raphaelmansuy 2/20/2026

Edgequake-litellm – Rust-backed drop-in replacement for LiteLLM (v0.1)

Rust-backed LLM provider abstraction library supporting OpenAI, Anthropic, Gemini with caching and cost tracking.

HN jerr12939 2/20/2026

Show HN: An e-ink air traffic monitor built with Cloudflare Workers

E-ink air traffic monitor built with Cloudflare Workers and custom display layouts.

HN vinhnx 2/20/2026

A Guide to Which AI to Use in the Agentic Era

Guide explaining the shift in AI usage from chatbot conversations to autonomous agents that complete tasks using tools.

HN tetubrah 2/20/2026

Show HN: Sinkai – Let AI agents hire humans for real-world tasks

Sinkai platform enables AI agents to delegate real-world tasks to humans via API, handling handoffs for on-site checks and physical verification with structured result collection.

HN Alan_Writer 2/20/2026

OpenAI and Paradigm Launches EVMbench to Test AIs on Smart Contract Security

OpenAI and Paradigm release EVMbench to evaluate AI agents' ability to detect and patch smart contract vulnerabilities across 120 vulnerability types.

HN tylersuard 2/20/2026

Agentic Internet Protocol (AIP), an agent-only web built from small text pages

Agentic Internet Protocol specification for text-based agent-only web using simplified Node structure, replacing HTML with predictable machine-readable format.

HN AmberLlama81 2/20/2026

Amazon service was taken down by AI coding bot

Financial Times paywalled article about AI coding bot disrupting Amazon service, minimal technical details provided.

HN kukla3 2/20/2026

Agentic AI and the Mythical Agent Month

Position paper examining whether AI agents can overcome Brooks' Law through scalable agency, exploring theoretical advantages of instantaneous context loading.

HN rushil_b_patel 2/20/2026

Show HN: Prompt Indexing for ChatGPT Session

Project for indexing ChatGPT sessions; minimal content available.

HN lyall 2/20/2026

Show HN: I made a static site for exploring names

Static site for exploring US Social Security baby name data with visualizations and preference-based recommendations.

Ax Renato Marcelo, Ana Rodrigues, Cristiana Palmela Pereira, António Figueiras, Rui Santos, José Rui Figueira, Alexandre P Francisco, Cátia Vaz 2/20/2026

AIdentifyAGE Ontology for Decision Support in Forensic Dental Age Assessment

Ontology framework for decision support in forensic dental age assessment for judicial and healthcare contexts involving undocumented individuals.

Ax H. Sinan Bank, Daniel R. Herber 2/20/2026

Retrieval Augmented (Knowledge Graph), and Large Language Model-Driven Design Structure Matrix (DSM) Generation of Cyber-Physical Systems

Explores using LLMs and RAG techniques to generate Design Structure Matrices for cyber-physical systems, tested on power tools and CubeSat designs.

Ax Hua Yan, Heng Tan, Yingxue Zhang, Yu Yang 2/20/2026

Mobility-Aware Cache Framework for Scalable LLM-Based Human Mobility Simulation

MobCache framework enables efficient large-scale human mobility simulation using LLMs as agents through reconstructible caches to reduce computational costs.

Ax Mubashara Akhtar, Anka Reuel, Prajna Soni, Sanchit Ahuja, Pawan Sasanka Ammanamanchi, Ruchit Rawal, Vilém Zouhar, Srishti Yadav, Chenxi Whitehouse, Dayeon Ki, Jennifer Mickel, Leshem Choshen, Marek Šuppa, Jan Batzner, Jenny Chim, Jeba Sania, Yanan Long, Hossein A. Rahmani, Christina Knight, Yiyang Nan, Jyoutir Raj, Yu Fan, Shubham Singh, Subramanyam Sahoo, Eliya Habba, Usman Gohar, Siddhesh Pawar, Robert Scholz, Arjun Subramonian, Jingwei Ni, Mykel Kochenderfer, Sanmi Koyejo, Mrinmaya Sachan, Stella Biderman, Zeerak Talat, Avijit Ghosh, Irene Solaiman 2/20/2026

When AI Benchmarks Plateau: A Systematic Study of Benchmark Saturation

Systematic analysis of benchmark saturation across 60 LLM benchmarks, showing many quickly lose ability to differentiate best-performing models.

Ax Yonatan Gideoni, Sebastian Risi, Yarin Gal 2/20/2026

Simple Baselines are Competitive with Code Evolution

Empirical study showing simple baselines compete with code evolution techniques in mathematical bounds, agent scaffolds, and ML competitions.

Ax Zhongcan Xiao (Neutron Scattering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA), Leyi Zhang (Neutron Scattering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA; Department of Linguistics, University of Illinois Urbana-Champaign, Urbana, Illinois, USA), Guannan Zhang (Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA), Xiaoping Wang (Neutron Scattering Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA) 2/20/2026

NeuDiff Agent: A Governed AI Workflow for Single-Crystal Neutron Crystallography

NeuDiff Agent: LLM-based workflow for automated analysis and reporting in neutron crystallography at the Spallation Neutron Source.

Ax Eiman Kanjo, Mustafa Aslanov 2/20/2026

Node Learning: A Framework for Adaptive, Decentralised and Collaborative Network Edge AI

Node Learning: decentralized paradigm for edge AI in which intelligence resides at individual nodes without centralized servers.

Ax Luis Merino, Gabriel Navarro, Carlos Salvatierra, Evangelina Santos 2/20/2026

An order-oriented approach to scoring hesitant fuzzy elements

Mathematical framework for scoring hesitant fuzzy elements using order theory, not directly related to AI/ML interests.

Ax Priyaranjan Pattnayak, Sanchari Chowdhuri 2/20/2026

IndicJR: A Judge-Free Benchmark of Jailbreak Robustness in South Asian Languages

IndicJR judge-free benchmark of jailbreak robustness across 12 Indic/South Asian languages covering 45,216 adversarial prompts.

Ax Haiyang Xu, Xi Zhang, Haowei Liu, Junyang Wang, Zhaozai Zhu, Shengjie Zhou, Xuhao Hu, Feiyu Gao, Junjie Cao, Zihua Wang, Zhiyuan Chen, Jitong Liao, Qi Zheng, Jiahui Zeng, Ze Xu, Shuai Bai, Junyang Lin, Jingren Zhou, Ming Yan 2/20/2026

Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents

GUI-Owl-1.5: native GUI agent model in multiple sizes supporting desktop, mobile, and browser, with state-of-the-art results on 20+ automation benchmarks.

Ax Hongwei Li, Zhun Wang, Qinrun Dai, Yuzhou Nie, Jinjun Peng, Ruitong Liu, Jingyang Zhang, Kaijie Zhu, Jingxuan He, Lun Wang, Yangruibo Ding, Yueqi Chen, Wenbo Guo, Dawn Song 2/20/2026

OpenSage: Self-programming Agent Generation Engine

OpenSage: presented as the first agent development kit with self-programming capability for automatically designing agent topology, tools, and memory components.

Ax Tanqiu Jiang, Yuhui Wang, Jiacheng Liang, Ting Wang 2/20/2026

AgentLAB: Benchmarking LLM Agents against Long-Horizon Attacks

AgentLAB benchmark for evaluating LLM agent vulnerabilities to adaptive long-horizon attacks in complex multi-turn environments.

Ax Juliusz Ziomek, William Bankes, Lorenz Wolf, Shyam Sundhar Ramesh, Xiaohang Tang, Ilija Bogunovic 2/20/2026

LLM-WikiRace: Benchmarking Long-term Planning and Reasoning over Real-World Knowledge Graphs

LLM-WikiRace benchmark evaluates planning, reasoning, and world knowledge by requiring models to navigate Wikipedia hyperlinks from source to target page.

Ax Idhant Gulati, Shivam Raval 2/20/2026

Narrow fine-tuning erodes safety alignment in vision-language agents

Study showing fine-tuning vision-language agents on narrow tasks causes emergent misalignment that generalizes across unrelated domains and modalities.

Ax Justin Albrethsen, Yash Datta, Kunal Kumar, Sharath Rajasekar 2/20/2026

DeepContext: Stateful Real-Time Detection of Multi-Turn Adversarial Intent Drift in LLMs

DeepContext stateful monitoring framework for detecting adversarial intent drift across multi-turn LLM dialogues, addressing safety gaps in sequential interactions.

Ax Hexi Jin, Stephen Liu, Yuheng Li, Simran Malik, Yiying Zhang 2/20/2026

SourceBench: Can AI Answers Reference Quality Web Sources?

SourceBench evaluates the quality of web sources cited by LLMs across 100 queries using an eight-metric framework that goes beyond correctness.

Ax Arnold Cartagena, Ariane Teixeira 2/20/2026

Mind the GAP: Text Safety Does Not Transfer to Tool-Call Safety in LLM Agents

GAP benchmark reveals that text-level safety alignment in LLM agents doesn't transfer to tool-call safety, measuring real-world action harms.

Ax Hejia Zhang, Zhongming Yu, Chia-Tung Ho, Haoxing Ren, Brucek Khailany, Jishen Zhao 2/20/2026

LLM4Cov: Execution-Aware Agentic Learning for High-coverage Testbench Generation

LLM4Cov framework for offline agent learning applied to high-coverage hardware testbench generation using non-differentiable execution feedback.

Ax Xinhao Deng, Jiaqing Wu, Miao Chen, Yue Xiao, Ke Xu, Qi Li 2/20/2026

Automating Agent Hijacking via Structural Template Injection

Phantom: automated agent hijacking attack on LLM agents via structural template injection, addressing an OWASP-highlighted threat with improved transferability.

Ax Srikumar Nayak 2/20/2026

HQFS: Hybrid Quantum Classical Financial Security with VQC Forecasting, QUBO Annealing, and Audit-Ready Post-Quantum Signing

Quantum-classical hybrid approach to financial risk prediction combining VQC forecasting, QUBO optimization, and post-quantum cryptography.

Ax Vishal Srivastava 2/20/2026

Fundamental Limits of Black-Box Safety Evaluation: Information-Theoretic and Computational Barriers from Latent Context Conditioning

Theoretical analysis of fundamental limits in black-box safety evaluation of AI systems, showing latent context-conditioned policies create evaluation gaps.

Ax Yan Wang, Yi Han, Lingfei Qian, Yueru He, Xueqing Peng, Dongji Feng, Zhuohan Xie, Vincent Jim Zhang, Rosie Guo, Fengran Mo, Jimin Huang, Yankai Chen, Xue Liu, Jian-Yun Nie 2/20/2026

Conv-FinRe: A Conversational and Longitudinal Benchmark for Utility-Grounded Financial Recommendation

Conv-FinRe benchmark for stock recommendation that evaluates utility-grounded decisions rather than behavioral imitation in conversational finance advisory.

Ax Zhao Tan, Yiji Zhao, Shiyu Wang, Chang Xu, Yuxuan Liang, Xiping Liu, Shirui Pan, Ming Jin 2/20/2026

Sonar-TS: Search-Then-Verify Natural Language Querying for Time Series Databases

Sonar-TS: neuro-symbolic framework for natural language querying of time series databases, handling morphological intents and ultra-long histories.

Ax Saurav Pal 2/20/2026

Cinder: A fast and fair matchmaking system

Fair matchmaking system for multiplayer games balancing heterogeneous skill levels in lobbies.

Ax Zichen Wang, Wanli Ma, Zhenyu Ming, Gong Zhang, Kun Yuan, Zaiwen Wen 2/20/2026

M2F: Automated Formalization of Mathematical Literature at Scale

M2F agentic framework for end-to-end project-scale autoformalization of mathematics in Lean, managing cross-file dependencies and imports.

Ax Deepanjan Bhol 2/20/2026

Sales Research Agent and Sales Research Bench

AI agent for Microsoft Dynamics 365 Sales querying live CRM data, reasoning over schemas, and producing decision-ready insights with benchmarking.

Ax Shengtian Yang (Southeast University, Kuaishou Technology), Yu Li (Southeast University), Shuo He (Nanyang Technological University), Yewen Li (Kuaishou Technology), Qingpeng Cai (Kuaishou Technology), Peng Jiang (Kuaishou Technology), Lei Feng (Southeast University) 2/20/2026

Phase-Aware Mixture of Experts for Agentic Reinforcement Learning

Mixture-of-Experts architecture for RL policy networks in LLM agents, addressing simplicity bias by allocating capacity across task complexity.