Your terminal finally has memory!
Terminal tool with local AI memory using Ollama. Save/recall commands, notes, URLs via natural language. Runs locally, no cloud.
Rust-accelerated RL framework using Polars pattern: Rust data plane + Python control plane via PyO3. 140x speedup with Rayon parallelism. Published on crates.io with 695 tests.
Vibe is a mobile app enabling remote code execution with Claude Code and Gemini CLI, with web preview and session management.
Personal setup combining Claude Code with specialized domain agents, parallel code review, and self-improving knowledge systems.
Programming pattern using filenames as configuration to make programs self-contained and portable without flags or scripts.
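The filename-as-configuration pattern can be sketched in a few lines. This is an illustrative Python example, not the original author's implementation; the dotted naming scheme and the key names (`job`, `schedule`, `compress`) are assumptions for demonstration.

```python
# Hypothetical sketch of the "filename as configuration" pattern:
# the program reads its settings out of its own invocation name, so a
# single self-contained script can be copied or symlinked under
# different names instead of taking flags or wrapper scripts.
import os
import sys

def config_from_name(argv0: str) -> dict:
    """Parse settings embedded in the program's filename.

    A name like "backup.daily.gz.py" yields {"job": "backup",
    "schedule": "daily", "compress": "gz"} (schema is illustrative).
    """
    stem = os.path.basename(argv0)
    stem = stem.rsplit(".py", 1)[0]          # drop the extension
    parts = stem.split(".")
    keys = ["job", "schedule", "compress"]   # assumed positional schema
    return dict(zip(keys, parts))

if __name__ == "__main__":
    print(config_from_name(sys.argv[0]))
```

Copying the script to `backup.weekly.gz.py` changes its behavior with no flags, environment variables, or config files involved.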
Dropbox optimized their relevance judge using DSPy for Dash, improving ranking and evaluation across multiple ML pipelines at scale.
TrustAgentAI is an open-source accountability layer adding cryptographic receipts and non-repudiation to MCP tool calls for AI agents.
Gas Town is Steve Yegge's agent orchestrator coordinating multiple AI coding agents simultaneously, hosted on Kilo Cloud infrastructure.
HYQNET is a neural-symbolic model that answers complex first-order logic queries on knowledge graphs by integrating interpretability with generalization.
NextMem proposes a latent factual memory framework for LLM-based agents to address limitations of existing textual and parametric memory approaches.
AIDABench: Comprehensive benchmark for AI data analytics and document understanding. Evaluates end-to-end task effectiveness in practical document processing scenarios.
Comprehension-Gated Agent Economy: Formal architecture linking AI agent economic permissions to verified comprehension. Robustness-first approach to agent authorization.
CraniMem: Neurocognitively-inspired gated and bounded multi-stage memory design for long-running LLM agents. Improves retention stability and content consolidation.
GSI Agent: Domain knowledge enhancement for LLMs in green stormwater infrastructure. Combines LLM with domain knowledge for inspection and maintenance guidance.
Cost-sensitive store routing for memory-augmented agents. Formulates selective memory retrieval as routing problem to reduce context tokens and improve efficiency.
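The routing idea can be illustrated with a toy sketch. This is my construction under stated assumptions, not the paper's algorithm: each store advertises an estimated context-token cost, a placeholder scorer estimates relevance, and the router greedily selects stores by relevance per token under a budget.

```python
# Illustrative sketch: selective memory retrieval framed as routing.
# The relevance scorer is a crude stand-in (word overlap with the
# store name); a real system would use embeddings or learned routing.
from dataclasses import dataclass

@dataclass
class Store:
    name: str
    token_cost: int   # tokens the store's results would add to context

def relevance(query: str, store: Store) -> float:
    q = set(query.lower().split())
    return len(q & set(store.name.lower().split("_"))) / max(len(q), 1)

def route(query: str, stores: list[Store], token_budget: int) -> list[str]:
    """Greedily pick stores by relevance-per-token until the budget is spent."""
    ranked = sorted(stores, key=lambda s: relevance(query, s) / s.token_cost,
                    reverse=True)
    chosen, spent = [], 0
    for s in ranked:
        if relevance(query, s) > 0 and spent + s.token_cost <= token_budget:
            chosen.append(s.name)
            spent += s.token_cost
    return chosen
```

Irrelevant stores are never queried, so their contents never consume context tokens, which is the efficiency gain the summary describes.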
DynaTrust: Defense mechanism against sleeper agents in multi-agent systems using dynamic trust graphs. Detects agents that hide malicious behavior until triggered.
Theoretical analysis of Query-Value mechanism in Transformers from linguistic perspective. Explains efficacy of MQA, GQA, and MLA architectures and trade-offs.
Atlas: Memory kernel that compiles task experience into agent instructions without fine-tuning or RAG. Improves agent memory utility via instruction-level compilation.
Quantum-Secure-By-Construction design paradigm for agentic AI systems. Addresses post-quantum cryptographic challenges in long-lived distributed agent deployments.
Latent Posterior Factors framework for aggregating multiple noisy evidence sources without manual feature engineering. Addresses uncertainty in real-world decision-making.
Theoretical characterization of Latent Posterior Factors for aggregating heterogeneous evidence in probabilistic prediction. Formal guarantees for multi-evidence reasoning.
Empirical study measuring LLM robustness to increasing context length on SQuAD and HotpotQA. Analyzes accuracy degradation with context size.
CUBE: Universal benchmark standard for AI agents built on MCP and Gym. Addresses fragmentation by allowing benchmarks to be wrapped once and used everywhere.
Prose2Policy: LLM pipeline translating natural-language access control policies into executable Rego code. End-to-end pipeline with test generation and validation.
Empirical study of GPT-4.1 behavior in gambling tasks under different persona prompts. Examines whether LLM risk behavior reflects principled patterns or prompt mimicry.
Regularized latent dynamics prediction as baseline for behavioral foundation models, examining how state feature choice affects task adaptability and reward function expressivity.
Framework for governing embodied AI in critical infrastructure through hybrid oversight modes and bounded autonomy, addressing resilience beyond statistically representable uncertainty.
AsgardBench evaluates visually grounded interactive planning for embodied AI agents, focusing on high-level action sequence generation with plan adaptation based on visual feedback.
Monte Carlo simulation evaluating prompt engineering strategies for LLM-generated personality assessment items across zero-shot, few-shot, and persona-based designs.
Lean 4 formalization of Vlasov-Maxwell-Landau equilibrium using AI reasoning (Gemini DeepThink) and agentic tools (Claude Code) demonstrating AI-assisted mathematical research workflows.
Framework combining computational argumentation with LLMs to create transparent, verifiable AI agents that reason collaboratively with humans rather than providing opaque recommendations.
Agent Rosetta uses LLMs as specialized scientific agents for protein design tasks, emulating reasoning and tool use for broad design pipelines beyond canonical amino acids.
MAC automatically learns constitutional AI rules from training data using multi-agent approaches, improving upon existing LLM-based prompt optimizers through structured learning.
Formal proof that safety is non-compositional: two individually incapable agents can collectively reach forbidden goals through emergent conjunctive capability dependencies.
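The non-compositionality claim has a simple toy instance (my construction, not the paper's formal proof): a forbidden goal requires a conjunction of capabilities, neither agent alone holds the full set, yet the union of their capabilities covers it.

```python
# Toy illustration of non-compositional safety: exfiltration requires
# both reading the secret AND sending it over the network. Each agent
# alone lacks one capability and is therefore individually "safe".
FORBIDDEN_GOAL = {"read_secret", "network_send"}

def can_reach(capabilities: set[str], goal: set[str]) -> bool:
    return goal <= capabilities   # goal is reachable iff all its
                                  # required capabilities are held

agent_a = {"read_secret", "write_file"}
agent_b = {"network_send", "browse_web"}

assert not can_reach(agent_a, FORBIDDEN_GOAL)   # A alone: safe
assert not can_reach(agent_b, FORBIDDEN_GOAL)   # B alone: safe
assert can_reach(agent_a | agent_b, FORBIDDEN_GOAL)   # together: unsafe
```

Verifying each agent in isolation therefore says nothing about the safety of their composition.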
petscagent-bench evaluates AI-generated scientific code for HPC libraries beyond test-case matching, assessing solver selection, API conventions, memory management, and performance.
Write-time gating mechanism filters incoming knowledge objects based on salience scores to improve retrieval-augmented generation accuracy and mirror biological memory archiving.
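Write-time gating can be sketched as follows. This is a hedged illustration, not the paper's mechanism: the salience scorer here is a placeholder (novelty as the fraction of previously unseen words), where a real system would use a learned score.

```python
# Sketch of write-time gating: instead of storing every incoming
# knowledge object and filtering at retrieval time, score salience
# at write time and archive only items that clear a threshold.
class GatedMemory:
    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold
        self.items: list[str] = []
        self._seen: set[str] = set()

    def salience(self, text: str) -> float:
        """Placeholder salience: fraction of words not seen before."""
        words = text.lower().split()
        if not words:
            return 0.0
        novel = [w for w in words if w not in self._seen]
        return len(novel) / len(words)

    def write(self, text: str) -> bool:
        """Admit the item only if its salience clears the gate."""
        score = self.salience(text)
        self._seen.update(text.lower().split())
        if score >= self.threshold:
            self.items.append(text)
            return True
        return False
```

Redundant writes are rejected at the source, so the retrieval index stays small and high-salience, loosely mirroring how biological memory consolidates only some experiences.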
IRAM-Omega-Q computational architecture uses quantum-like density matrices to model internal regulation and uncertainty management in artificial agents under stochastic perturbation.
Model Workspace Protocol (MWP) simplifies agentic AI orchestration using folder structures for sequential workflows, reducing engineering overhead compared to multi-agent frameworks.
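A folder-as-workflow convention in the spirit of this summary can be sketched minimally (MWP's actual layout may differ; the `prompt.txt`/`output.txt` file names and the numbered-folder ordering are assumptions):

```python
# Speculative sketch: each numbered subfolder is one sequential step;
# a step's accumulated output is written beside it and flows into the
# next step, replacing explicit multi-agent orchestration code.
import tempfile
from pathlib import Path

def run_workflow(root: Path) -> str:
    """Execute steps in lexicographic folder order, piping text forward."""
    payload = ""
    for step in sorted(p for p in root.iterdir() if p.is_dir()):
        prompt = (step / "prompt.txt").read_text()
        # Stand-in for an agent/model call: just append the prompt.
        payload = f"{payload}{prompt}\n"
        (step / "output.txt").write_text(payload)  # auditable per-step artifact
    return payload
```

Because every intermediate result lands on disk next to its step, the workflow is inspectable and resumable with nothing more than a file browser.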
Enhances OpenVLA vision-language-action models with synthetic instruction augmentation to improve zero-shot performance in new environments for embodied AI tasks.
POaaS optimizes prompts for on-device small language models through minimal edits, reducing hallucinations and improving accuracy without requiring lengthy structured instructions.
Context alignment pre-processor enhances LLM dialogue coherence by resolving contextual misalignment when users omit premises, simplify references, or shift context during interactions.
ARISE uses hierarchical reinforcement learning to improve mathematical reasoning in LLMs by developing reusable strategies that accumulate during training rather than treating problems in isolation.
VIGIL deploys edge-resident AI agents for enterprise IT support, performing diagnosis, knowledge retrieval, and policy-governed remediation on user devices with consent and observability.
NeuronSpark: 0.9B-parameter spiking neural network language model using state-space dynamics and surrogate gradients without Transformer distillation.
SQL-ASTRA: agentic reinforcement learning framework for text-to-SQL using column-set matching and trajectory aggregation for credit assignment.
Data contamination audit reveals public LLM benchmarks may be leaked in training data; questions claims of superhuman performance.
Framework for safe LLM-based IoT agents using dual-stage intent analysis to prevent hallucination and reduce interaction overhead.
MOSAIC: modular control token approach for context-dependent safety alignment in LLMs across applications and regions.
Adaptive theory of mind framework for LLM-based multi-agent coordination, aligning agents' reasoning depth about others' mental states.