Isolater - Feed

Ax Yash Jangir, Yidi Zhang, Pang-Chi Lo, Kashu Yamazaki, Chenyu Zhang, Kuan-Hsun Tu, Tsung-Wei Ke, Lei Ke, Yonatan Bisk, Katerina Fragkiadaki 3/16/2026

RobotArena ∞: Scalable Robot Benchmarking via Real-to-Sim Translation

RobotArena ∞: scalable robot benchmarking via real-to-sim translation. Enables rigorous evaluation of robot policies across diverse tasks and environments.

Ax Roy Rinberg, Adam Karvonen, Alexander Hoover, Daniel Reuter, Keri Warr 3/16/2026

Verifying LLM Inference to Detect Model Weight Exfiltration

Verifying LLM inference to detect model weight exfiltration via steganography. Defends inference servers against model theft and anomalous behavior.

Ax Zeyun Hu, Yang Liu 3/16/2026

Stochastic Dominance Constrained Optimization with S-shaped Utilities: Poor-Performance-Region Algorithm and Neural Network

Portfolio optimization under stochastic dominance constraints with S-shaped utilities. Investigates first and second-order dominance constraints.

Ax Anees Ur Rehman Hashmi, Numan Saeed, Christoph Lippert 3/16/2026

AnatomiX, an Anatomy-Aware Grounded Multimodal Large Language Model for Chest X-Ray Interpretation

AnatomiX: anatomy-aware multimodal LLM for chest X-ray interpretation. Improves spatial reasoning and anatomical understanding in medical imaging.

Ax Shadeeb Hossain 3/16/2026

Prediction of Cellular Malignancy Using Electrical Impedance Signatures and Supervised Machine Learning

Study of bioelectrical properties for malignancy detection. Systematic review of 535 datasets on cellular bioelectric parameters across frequencies.

Ax Numan Halit Guldemir, Oluwafemi Olukoya, Jesús Martínez-del-Rincón 3/16/2026

FARM: Few-shot Adaptive Malware Family Classification under Concept Drift

FARM framework for malware family classification under concept drift. Uses triplet autoencoder for few-shot adaptation to covariate and label drift.

Ax Xinwu Ye, Yicheng Mao, Jia Zhang, Yimeng Liu, Li Hao, Fang Wu, Zhiwei Li, Yuxuan Liao, Zehong Wang, Yingcheng Wu, Zhiyuan Liu, Zhenfei Yin, Li Yuan, Philip Torr, Huan Sun, Xiangxiang Zeng, Mengdi Wang, Le Cong, Shenghua Gao, Xiangru Tang 3/16/2026

LatentChem: From Textual CoT to Latent Thinking in Chemical Reasoning

LatentChem: latent reasoning interface for chemical LLMs. Decouples chemical computation from discrete tokens to improve efficiency and performance in chemical reasoning.

Ax Antonin Sulc 3/16/2026

FastLSQ: Solving PDEs in One Shot via Fourier Features with Exact Analytical Derivatives

FastLSQ framework for solving PDEs using Fourier features with analytical derivatives. Achieves high accuracy on 1-6D problems without autodiff.

Ax Arindam Khaled 3/16/2026

Pyramid MoA: A Probabilistic Framework for Cost-Optimized Anytime Inference

Pyramid MoA: probabilistic framework for cost-optimized LLM inference via cascading and routing. Balances inference cost and reasoning capability for large language models.

Ax Natalia da Silva, Dianne Cook, Eun-Kyung Lee 3/16/2026

An Enhanced Projection Pursuit Tree Classifier with Visual Methods for Assessing Algorithmic Improvements

Enhancements to projection pursuit tree classifier with visual diagnostic methods for high-dimensional classification. Addresses limitations in multi-class settings.

Ax Markus Knauer, Samuel Bustamante, Thomas Eiband, Alin Albu-Schäffer, Freek Stulp, João Silvério 3/16/2026

IROSA: Interactive Robot Skill Adaptation using Natural Language

IROSA: framework combining foundation models with imitation learning for robot skill adaptation via natural language. LLM application to robotics.

Ax Jinman Wu, Yi Xie, Shen Lin, Shiqian Zhao, Xiaofeng Chen 3/16/2026

Knowing without Acting: The Disentangled Geometry of Safety Mechanisms in Large Language Models

Disentangled Safety Hypothesis: mechanistic study of LLM safety showing decoupling between harmfulness detection and refusal. ML interpretability research.

Ax Linus Folkerts, Will Payne, Simon Inman, Philippos Giavridis, Joe Skinner, Sam Deverett, James Aung, Ekin Zorer, Michael Schmatz, Mahmoud Ghanem, John Wilkinson, Alan Steer, Vy Hong, Jessica Wang 3/16/2026

Measuring AI Agents' Progress on Multi-Step Cyber Attack Scenarios

Benchmark evaluating frontier AI models on multi-step cyber attack scenarios. Agent capability measurement across extended action sequences.

Ax Mayank Saini, Arit Kumar Bishwas 3/16/2026

One Supervisor, Many Modalities: Adaptive Tool Orchestration for Autonomous Queries

Agentic framework for multimodal query processing with adaptive tool orchestration across text/image/audio/video. Research on agent coordination and tool selection.

Ax Abhinaba Basu, Pavan Chakraborty 3/16/2026

Proof-Carrying Materials: Falsifiable Safety Certificates for Machine-Learned Interatomic Potentials

Proof-Carrying Materials: falsifiable safety certificates for machine-learned interatomic potentials. ML research on reliability guarantees for scientific models.

BL 3/16/2026

Why Codex Security Doesn’t Include a SAST Report

Codex Security: AI agent for code security that analyzes repository architecture and trust boundaries before validating findings with humans.

HN mooreds 3/15/2026

AI-as-Code for Agent Factories

AI-as-Code approach for agent factories.

HN mooreds 3/15/2026

Multi-agent fleet management for coding agents

Open-source AgentFactory orchestrates fleet of coding agents (Claude, Codex, Spring AI) through automated pipeline for issue resolution and code shipping.

HN jostylr 3/15/2026

OpenJarvis: Personal AI, on Personal Devices

Open-source framework for personal AI agents running entirely on-device with efficiency-aware evaluations and learning loop using local trace data.

HN EvanZhouDev 3/15/2026

Show HN: Free OpenAI API Access with ChatGPT Account

NPM package enabling free OpenAI API access via ChatGPT OAuth tokens. Creates localhost proxy to ChatGPT backend API with Vercel AI SDK provider support.

HN piotrgrudzien 3/15/2026

I'm Too Lazy to Check Datadog Every Morning, So I Made AI Do It

AI automation tool to summarize Datadog monitoring alerts and escalate issues, reducing manual dashboard review.

HN mattyhogan 3/15/2026

Show HN: Lux – Drop-in Redis replacement in Rust. 5.6x faster, ~1MB Docker image

Multi-threaded Redis replacement in Rust (5.6x faster, ~1MB Docker image) with drop-in compatibility and concurrent architecture.

HN xpe 3/15/2026

LessWrong Policy on LLM Use

LessWrong editor UI update with Lexical framework and WYSIWYG improvements.

HN EwanG 3/15/2026

Sewage Dump Is Now One of America's Best Bird Sanctuaries [video]

Video about sewage facility becoming bird sanctuary. Off-topic.

HN tjohnell 3/15/2026

LLMs can be absolutely exhausting

Discussion of mental fatigue and workflow challenges when working with LLMs like Claude and Codex, and recovery strategies.

HN jeremyjh 3/15/2026

Claude-Code-Workflow – Orchestrate Multiple CLI Agents

Multi-agent workflow orchestration system supporting Gemini, Qwen, Claude with role-based agents, background execution, and visual workflow editing.

HN _kb 3/15/2026

AI is helping choose targets in Iran war – now it's a target too

Report on Iranian drone strikes against AWS data centers in UAE used for AI infrastructure.

HN Mofa1245 3/15/2026

Continuum – GitHub Action that detects LLM drift in CI

GitHub Action detecting LLM output drift in CI/CD by replaying workflows and diffing outputs to prevent silent model changes reaching production.

HN sroerick 3/15/2026

Show HN: Pakkun – Vibeslop Git for ETL

CLI tool for managing ETL transformation pipelines with artifact versioning and SQLite provenance tracking.

HN philangist 3/15/2026

Data scientist uses AI and ChatGPT to create cancer vaccine for his dying dog

Anecdotal story about data scientist using AI and ChatGPT to develop cancer vaccine for dog.

HN darshannere 3/15/2026

Show HN: ObservAgent – Observability for Claude Code (cost, tools, subagents)

Dashboard for real-time observability into Claude Code sessions, tracking costs, tool usage, and subagent execution without code changes.

HN yamafaktory 3/15/2026

Show HN: HypergraphZ – A Hypergraph Implementation in Zig

Hypergraph data structure implementation in Zig language with research community modeling example.

HN stuartmemo 3/15/2026

Seedance 2.0 delayed due to copyright disputes

ByteDance delays Seedance 2.0 video generation model launch due to copyright disputes with Hollywood studios.

HN kirpals99 3/15/2026

Faith Claw – Security middleware for autonomous AI agents (OpenClaw)

Security middleware for autonomous AI agents that risk-scores actions, detects injection attacks, and catches behavioral drift across multi-turn interactions.

HN n3on250 3/15/2026

Kuberna Labs – Open-source SDK for autonomous cross-chain AI agents

Open-source SDK for building autonomous AI agents that execute cross-chain financial operations with cryptographic guarantees and trusted execution environments.

HN Drew-Aetherwave 3/15/2026

Multi-agent coordination via timer-based Discord polling (Claude Code)

Multi-agent coordination system using Claude Code, Discord webhooks, and timer-based polling. Production autonomous workflows with real-time notifications.

HN earaujo 3/15/2026

Show HN: Claude's 2x usage promotion (March 2026) in your timezone

Timezone converter tool for Claude API usage promotion (Mar 2026). Minor LLM-adjacent utility.

HN quinndupont 3/15/2026

Integrity-Weighted Citation Metric

Academic citation metric (CiteIQ) weighted by author position and research integrity. Not AI/ML focused.

HN mooreds 3/15/2026

Securing AI Agents

Overview of layered security architecture for AI agents, emphasizing secure human identity verification and token-based authorization.

HN Sonofg0tham 3/15/2026

Show HN: Quell, a local security layer to stop AI IDEs leaking your secrets

Quell is a local security layer that intercepts prompts to AI IDEs, redacting secrets before they reach cloud models, storing values in OS keychain.

HN abekek 3/15/2026

Show HN: ARISE – Agents that create their own tools at runtime when they fail

ARISE framework enables LLM agents to synthesize their own tools at runtime when they encounter task gaps, adapting without pre-crafted tool libraries.

HN gounisalex 3/15/2026

Show HN: Turn any file into a CLI (reduce tokens vs. MCP)

clifast tool converts TypeScript/JavaScript functions into CLI packages with optimized help text for LLM navigation, reducing token usage versus MCP.

HN ninjaplavi 3/15/2026

Show HN: LearnFork – Branching AI chat for learning and researching

LearnFork: tool for branching AI chat conversations for learning and research. Source provides minimal details.

HN sydney-liveauth 3/15/2026

I Built LiveAuth: POW and Lightning Network Authentication for AI Agents

LiveAuth system providing Proof-of-Work and Lightning Network authentication for AI agents, replacing CAPTCHAs and API keys.

HN g_br_l 3/15/2026

Do you really need an agent?

Critical perspective on AI agent hype, questioning whether agents are necessary or overused in current implementations.

HN opsmeter 3/15/2026

Show HN: Opsmeter.io – AI cost attribution and budget control for LLM apps

Opsmeter tool for cost attribution and budget control in LLM applications, breaking down spending by endpoint, tenant, user, and model.

HN keterslater 3/15/2026

Caliber – AI setup tailored for your codebase

Caliber scans codebases to auto-generate tailored AI agent skills, configs, and recommended MCPs matching project stack and best practices.

HN eka_aibuilder 3/15/2026

Free LLM cost calculator – what your AI product costs across providers

Free tool for analyzing and comparing AI product costs across 9 LLM providers before implementation to identify optimal architecture.

HN colinprince 3/15/2026

I use the 'cupcake' prompt to catch when AI is guessing

Blog post on using 'cupcake' prompt technique to detect AI hallucinations.

HN turoczy 3/15/2026

The "are you sure?" Problem: Why AI keeps changing its mind

Analysis of LLM inconsistency when prompted repeatedly on same question, showing tendency to contradict prior responses.