Isolater - Feed

HN magnumpowerz 3/4/2026

Show HN: Ungrind – the solopreneur CRM that updates itself

CRM tool for solopreneurs with self-updating capabilities. Minimal content available.

HN k4cper-g 3/4/2026

Show HN: CUP – MCP but for desktop UI (open spec for computer use agents)

Computer Use Protocol (CUP): universal schema for AI agents to perceive and interact with desktop UIs. Compact text encoding ~97% smaller than JSON for LLM context. Open spec for Windows, macOS, Linux, Web, Android, iOS.

HN luke14free 3/4/2026

Which LLMs fold under pressure? We made 6 LLMs argue 300 hard cases to find out

Benchmark testing 6 LLMs under adversarial pressure across 300 cases. Evaluates model resilience in argumentation and agentic workflows beyond standard capability tests.

HN onurkanbkrc 3/4/2026

Show HN: SFT to convert a base language model into a conversational chat model

Minimal code example demonstrating Supervised Fine-Tuning on Llama-2-7b using the OpenAssistant dataset with parameter-efficient techniques to create a chat model.

HN d0nk3yhm 3/4/2026

Donx64mcp-dbg – an injected DLL debugger toolkit with an MCP server for x64 apps

Windows x64 DLL debugger toolkit with MCP server for AI agents. Provides 40+ debug commands for real-time process inspection. Designed for security research and CTF.

HN tamnd 3/4/2026

Molmo 2: video understanding, pointing, and tracking

Molmo 2 open-source vision language model with state-of-the-art video understanding, pointing, and tracking capabilities. Hugging Face models available with training code.

HN rmhsilva 3/4/2026

Slack bot coding agent built on pi (mom)

Slack bot AI agent (Mom) powered by an LLM. Executes bash commands, manages files, installs tools, and configures credentials autonomously. Node.js app with Socket Mode integration.

HN selvan 3/4/2026

LLM Gateway: Budget enforcement, virtual API keys and usage analytics for LLMs

FastAPI-based LLM gateway proxy providing budget enforcement, virtual API key management, and usage analytics across multiple LLM providers.

HN jamesblonde 3/4/2026

Real-AI needs rolling aggregations and AI won't build them for you

Technical analysis of rolling aggregations as essential for real-time AI systems, covering incremental views and sub-millisecond latency approaches.

HN EDM115 3/4/2026

APM – Agent Package Manager (Microsoft)

Open-source Agent Package Manager by Microsoft. Dependency manager for AI agents declaring skills, prompts, instructions, and tools via apm.yml configuration files.

HN felixnotka 3/4/2026

Show HN: Audicia – Generate least-privilege Kubernetes RBAC from audit log

Open-source Kubernetes operator generating least-privilege RBAC policies from audit logs. Automates security policy creation from actual access patterns.

HN viruswami5511 3/4/2026

Show HN: GuardClaw – cryptographically verifiable execution logs for AI agents

GuardClaw implements cryptographically verifiable execution logs for autonomous AI agents using GEF-SPEC-1.0 protocol with append-only, immutable audit trails.

HN losalah 3/4/2026

What I do, and what I delegate to AI

Essay on treating AI as a leverage tool rather than a productivity hack, discussing how to reshape work and decision-making with AI.

HN reavid 3/4/2026

Show HN: EU AI Radar – 60-second self-check for EU AI Act exposure

Static HTML quiz tool to help EU companies assess their risk tier under the EU AI Act, with news feed and regulatory tracking.

HN SebSeb83 3/4/2026

AuroriaLink

Real-time team messaging solution with encryption and file sharing. In active development.

HN imanhashemi 3/4/2026

Show HN: Retro – active context curator for coding agents

Open-source Rust tool that manages context for AI coding agents using git hooks and SQLite, analyzing agent conversations to optimize performance on large codebases.

HN huey77 3/4/2026

Nuclear War: An LLM Scenario

Minimal stub post about nuclear war scenarios with LLMs; no content provided.

HN Bridgeye 3/4/2026

Show HN: Nova – AI terminal that writes, fixes, and ships your code

Nova is an AI-native developer workspace that executes code directly, eliminating the iterative chat-paste-error cycle of traditional AI coding assistants.

Ax Quinn Jacobson, Joe Luo, Jingfei Xu, Shanmuga Venkatachalam, Kevin Wang, Dingchao Rong, John Paul Shen 3/4/2026

NeuroHex: Highly-Efficient Hex Coordinate System for Creating World Models to Enable Adaptive AI

Hexagonal coordinate system for efficient world models in adaptive AI, inspired by grid cells in the human brain. Mathematical framework for rotational symmetry and low-cost spatial computation.

Ax Qiyuan Zhang, Junyi Zhou, Yufei Wang, Fuyuan Lyu, Yidong Ming, Can Xu, Qingfeng Sun, Kai Zheng, Peng Kang, Xue Liu, Chen Ma 3/4/2026

RubricBench: Aligning Model-Generated Rubrics with Human Standards

RubricBench benchmark for evaluating rubric-guided LLM reward models against human standards, addressing discriminative complexity in alignment evaluation.

Ax Jiahao Huang, Fengyan Lin, Xuechao Yang, Chen Feng, Kexin Zhu, Xu Yang, Zhide Chen 3/4/2026

Nano-EmoX: Unifying Multimodal Emotional Intelligence from Perception to Empathy

Nano-EmoX framework unifying multimodal emotional intelligence across perception, understanding, and interaction levels with cognitively-inspired hierarchy.

Ax Robin Young 3/4/2026

What Is the Alignment Tax?

Geometric theory formalizing alignment tax as projection in representation space, deriving Pareto frontier for safety-capability tradeoffs in LLMs.

Ax Hongjin Qian, Ziyi Xia, Ze Liu, Jianlyu Chen, Kun Luo, Minghao Qin, Chaofan Li, Lei Xiong, Junwei Lan, Sen Wang, Zhengyang Liang, Yingxia Shao, Defu Lian, Zheng Liu 3/4/2026

DeepXiv-SDK: An Agentic Data Interface for Scientific Literature

SDK giving LLM agents structured data access to scientific literature via an agentic interface, reducing token consumption and improving retrieval efficiency.

Ax Práxedes Martínez-Moreno, Andrea Valsecchi, Pablo Mesejo, Pilar Navarro-Ramírez, Valentino Lugli, Sergio Damas 3/4/2026

A Novel Evolutionary Method for Automated Skull-Face Overlay in Computer-Aided Craniofacial Superimposition

Evolutionary algorithm for automated skull-face overlay alignment in forensic craniofacial superimposition using 3D skull and 2D facial image correspondence.

Ax Grigory Sapunov 3/4/2026

Theory of Code Space: Do Code Agents Understand Software Architecture?

Theory of Code Space benchmark evaluating whether AI code agents understand software architecture through multi-file codebase exploration in procedurally generated environments.

Ax Zhanwang Liu, Yuting Li, Haoyuan Gao, Yexin Li, Linghe Kong, Lichao Sun, Weiran Huang 3/4/2026

IDER: IDempotent Experience Replay for Reliable Continual Learning

IDER method addressing catastrophic forgetting in continual learning through idempotent experience replay with uncertainty calibration.

Ax Zhonghang Li, Zongwei Li, Yuxuan Chen, Han Shi, Jiawei Li, Jierun Chen, Haoli Bai, Chao Huang 3/4/2026

FastCode: Fast and Cost-Efficient Code Understanding and Reasoning

FastCode system for efficient repository-scale code reasoning using selective context retrieval and compression for cost-effective LLM-based software engineering.

Ax Yixuan Tang, Zhenghong Lin, Yandong Sun, Wynne Hsu, Mong Li Lee, Anthony K. H. Tung 3/4/2026

QIME: Constructing Interpretable Medical Text Embeddings via Ontology-Grounded Questions

Proposes QIME framework for interpretable biomedical text embeddings using ontology-grounded natural language questions for clinical decision-making.

Ax Noura Al Helwani, Sophie Moufawad, Georges Sakr 3/4/2026

Solving Inverse PDE Problems using Minimization Methods and AI

Compares numerical methods with physics-informed neural networks for solving direct and inverse PDE problems in physical/engineering systems.

Ax Naoki Shitanda, Motoki Omura, Tatsuya Harada, Takayuki Osa 3/4/2026

Rethinking Policy Diversity in Ensemble Policy Gradient in Large-Scale Reinforcement Learning

Studies policy diversity in ensemble policy gradient methods for large-scale RL, analyzing exploration-exploitation tradeoffs across parallel environments.

Ax Harry Amad, Mihaela van der Schaar 3/4/2026

Hyperparameter Trajectory Inference with Conditional Lagrangian Optimal Transport

Hyperparameter trajectory inference framework using conditional Lagrangian optimal transport to enable post-deployment hyperparameter adjustments without retraining.

Ax Bowen Zhang, Junchuan Zhao, Ian McLoughlin, Ye Wang, A S Madhukumar 3/4/2026

CodecFlow: Efficient Bandwidth Extension via Conditional Flow Matching in Neural Codec Latent Space

Speech bandwidth extension method using conditional flow matching in neural codec latent space for improved clarity and intelligibility.

Ax Zongru Wu, Rui Mao, Zhiyuan Tian, Pengzhou Cheng, Tianjie Ju, Zheng Wu, Lingzhong Dong, Haiyue Sheng, Zhuosheng Zhang, Gongshen Liu 3/4/2026

See, Think, Act: Teaching Multimodal Agents to Effectively Interact with GUI by Identifying Toggles

Evaluates multimodal GUI agents' ability to identify and execute toggle controls, revealing reliability bottlenecks in ubiquitous GUI interaction.

Ax Adrian Robert Minut, Hazem Dewidar, Iacopo Masi 3/4/2026

Spilled Energy in Large Language Models

Reinterprets the LLM softmax as an energy-based model to track 'energy spills' during decoding, correlating them with factual errors and biases.

Ax Kwanyoung Kim, Sanghyun Kim 3/4/2026

Model Already Knows the Best Noise: Bayesian Active Noise Selection via Attention in Video Diffusion Model

Proposes ANSE method using Bayesian active noise selection with attention mechanisms to improve video diffusion quality by selecting optimal initial noise seeds.

Ax Shai Yehezkel, Omer Dahary, Andrey Voynov, Daniel Cohen-Or 3/4/2026

Navigating with Annealing Guidance Scale in Diffusion Space

Research on classifier-free guidance scale annealing in diffusion models to improve image quality and prompt alignment convergence during sampling.

Ax Chunyang Li, Yilun Zheng, Xinting Huang, Tianqing Fang, Jiahao Xu, Lihui Chen, Yangqiu Song, Han Hu 3/4/2026

WebDevJudge: Evaluating (M)LLMs as Critiques for Web Development Quality

WebDevJudge benchmark evaluates LLMs-as-judges for web development quality assessment, testing reliability on open-ended tasks with dynamic environments.

Ax Ran Li, Shimin Di, Haowei Li, Luanshi Bu, Jiachuan Wang, Wangze Ni, Lei Chen 3/4/2026

RxnNano: Training Compact LLMs for Chemical Reaction and Retrosynthesis Prediction via Hierarchical Curriculum Learning

RxnNano trains compact LLMs for chemical reaction prediction using hierarchical curriculum learning, emphasizing chemical intuition over parameter scaling.

Ax Ruike Cao, Shaojie Bai, Fugen Yao, Liang Dong, Jian Xu, Li Xiao 3/4/2026

ATPO: Adaptive Tree Policy Optimization for Multi-Turn Medical Dialogue

ATPO uses hierarchical reinforcement learning to optimize LLM behavior for multi-turn medical dialogues with incomplete information.

Ax Sieun Hyeon, Jaeyoung Do 3/4/2026

Is Retraining-Free Enough? The Necessity of Router Calibration for Efficient MoE Compression

Analysis of MoE compression methods identifies router-expert mismatch as key degradation factor; proposes calibration approach for efficient model deployment.

Ax Wei Liu, Siya Qi, Yali Du, Yulan He 3/4/2026

Self-Play Only Evolves When Self-Synthetic Pipeline Ensures Learnable Information Gain

Research on self-play loops in LLMs showing sustainable self-evolution requires learnable information gain, not just more synthetic data generation.

Ax Junfeng Fang, Nachuan Chen, Houcheng Jiang, Dan Zhang, Fei Shen, Xiang Wang, Xiangnan He, Tat-Seng Chua 3/4/2026

NExT-Guard: Training-Free Streaming Safeguard without Token-Level Labels

NExT-Guard provides training-free safeguarding for streaming LLM deployments without requiring token-level annotations or supervision.

Ax Yixin Wang, Yifan Hu, Peiyuan Liu, Naiqi Li, Dai Tao, Shu-Tao Xia 3/4/2026

Forecasting as Rendering: A 2D Gaussian Splatting Framework for Time Series Forecasting

Novel 2D Gaussian Splatting approach for time series forecasting that reshapes 1D sequences to preserve chronological continuity.

Ax Zizheng Zhang, Yiming Li, Justin Xu, Jinyu Wang, Rui Wang, Lei Song, Jiang Bian, David W Eyre, Jingjing Fu 3/4/2026

MedFeat: Model-Aware and Explainability-Driven Feature Engineering with LLMs for Clinical Tabular Prediction

MedFeat integrates LLM domain knowledge into feature engineering for clinical tabular prediction, balancing model characteristics with feature importance signals.

Ax Artus Krohn-Grimberghe 3/4/2026

MedCalc-Bench Doesn't Measure What You Think: A Benchmark Audit and the Case for Open-Book Evaluation

Audit of MedCalc-Bench clinical calculator benchmark reveals implementation issues and proposes open-book evaluation methodology for more accurate LLM assessment.

Ax Sazzad Bin Bashar Polock, Anandi Dutta, Subasish Das 3/4/2026

Characterizing and Predicting Wildfire Evacuation Behavior: A Dual-Stage ML Approach

ML research paper using correspondence analysis, clustering, and classification to model wildfire evacuation behavior from survey data.

Ax Brady Steele 3/4/2026

Subspace Geometry Governs Catastrophic Forgetting in Low-Rank Adaptation

Geometric theory of catastrophic forgetting in LoRA through gradient subspace interactions, deriving quantitative forgetting formula.

Ax Jingxuan Fan, Yueying Li, Zhenting Qi, Dinghuai Zhang, Kianté Brantley, Sham M. Kakade, Hanlin Zhang 3/4/2026

Scaling Reward Modeling without Human Supervision

Unsupervised reward modeling scaling via preference learning on web document prefixes/suffixes, reducing human annotation costs.

Ax Bojian Yin, Shurong Wang, Haoyu Tan, Sander Bohte, Federico Corradi, Guoqi Li 3/4/2026

Efficient Sparse Selective-Update RNNs for Long-Range Sequence Modeling

Efficient RNN architecture with selective state updates for long-range sequence modeling, reducing unnecessary computation on static inputs.

Ax Liang Chen, Qi Liu 3/4/2026

Neural Paging: Learning Context Management Policies for Turing-Complete Agents

Neural Paging architecture enabling Turing-complete agents by learning hierarchical context window management policies.