Sovereign AI
Promotional content about decentralized AI using InnoChain technology to democratize model training.
Overview of AI adoption across Chinese apps, universities, and consumer products with government support and constraints.
Analysis of exposed Claude Code source revealing AI engineering practices: 64K lines of core TypeScript in customer-facing code.
Audio Flamingo Next: open-source audio-language models for speech, sound, and music understanding.
Google Labs Whisk AI: free image generator that blends three visual inputs (subject, scene, style) using Gemini and Imagen 3.
Research showing LLMs respond to social persuasion techniques (authority, commitment, unity) similarly to humans, raising compliance concerns.
Open source agentic integration platform with CLI that auto-generates integration code from natural language descriptions.
Two-stage semantic chunking pipeline for RAG using LlamaIndex: structural splitting then semantic coherence for better document handling.
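The two-stage pipeline above can be sketched in plain Python without LlamaIndex, to keep it self-contained: stage one splits on structural boundaries (blank lines), stage two merges adjacent pieces whose embeddings are semantically similar. The `embed` function here is a bag-of-words stand-in for a real embedding model, and the 0.3 threshold is an illustrative assumption.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: bag-of-words token counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def structural_split(doc: str) -> list[str]:
    # Stage 1: structural splitting on paragraph boundaries.
    return [p.strip() for p in doc.split("\n\n") if p.strip()]

def semantic_merge(parts: list[str], threshold: float = 0.3) -> list[str]:
    # Stage 2: merge adjacent parts that stay semantically coherent,
    # so each final chunk covers one topic.
    if not parts:
        return []
    chunks = [parts[0]]
    for part in parts[1:]:
        if cosine(embed(chunks[-1]), embed(part)) >= threshold:
            chunks[-1] += "\n\n" + part
        else:
            chunks.append(part)
    return chunks
```

In a real RAG setup the structural pass would follow document markup (headings, sections) and the coherence pass would use dense embeddings, but the control flow is the same.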
Case study of AI vibe coding failure: non-technical person built faulty patient management system instead of using proven solutions.
Analysis of AI agent harnesses vs models for enterprise codebases, comparing Blitzy and GPT-5.4 on SWE-Bench Pro.
Native macOS proxy client built with SwiftUI using Liquid Glass UI design.
Guide on optimizing React Native Android release builds through multi-core CPU utilization and build tool configuration.
Governance control plane for AI systems enforcing human oversight, rollback capabilities, and agent guardrails with audit receipts.
arXivLabs framework announcement for collaborative feature development with focus on openness and data privacy.
User analysis claiming quality regression in Claude Sonnet 4.6 based on 60-day conversation logs tracking instruction repetition frequency.
Desktop and web viewer for Claude Code session logs with expandable tool calls and token tracking, built with Tauri and React.
Introduces Introspective Diffusion Language Models (I-DLM) using strided decoding to improve parallel token generation quality versus autoregressive models.
Open source reading app for children using DISTAR phonics method. Not AI/ML focused.
Bomberman-style 1v1 game benchmark where LLM agents compete in real-time interactive environment, inspired by ARC-AGI 3.
Hyperdimensional computing architecture based on Galois-field algebra showing path-dependent semantic selection mechanism.
Double-agent defender using theory-of-mind reasoning to protect LLMs from belief-steering attacks in adversarial dialogue.
AffordSim generates synthetic robotic manipulation data incorporating object affordances for semantically correct grasp and interaction trajectories.
Legal2LogicICL uses diverse few-shot learning with LLMs to improve generalization when converting legal cases to logical formulas.
Geometric methodology to mitigate shortcut learning and demographic bias in deep neural networks through topological constraints.
Evaluates robustness of watermarking techniques for autoregressive image generators against detection evasion and removal attacks.
Studies whether LLM-based agents improve cooperation in common-pool resource management through structured leadership and election mechanisms.
Game theory analysis of routing decisions with memory constraints and endogenous information recall using logit choice models.
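The logit choice model underlying such routing analyses has a simple closed form: a traveler picks route i with probability proportional to exp(-θ·cost_i), where θ tunes sensitivity to cost differences. A minimal sketch (the costs and θ here are illustrative, not from the paper):

```python
import math

def logit_choice(costs: list[float], theta: float = 1.0) -> list[float]:
    # Logit (softmax) route-choice probabilities: lower cost gives
    # higher probability; larger theta sharpens the preference.
    utils = [math.exp(-theta * c) for c in costs]
    z = sum(utils)
    return [u / z for u in utils]
```

With equal costs the model splits traffic evenly; as θ grows, probability mass concentrates on the cheapest route.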
Multi-ORFT stabilizes online reinforcement fine-tuning for multi-agent diffusion models in cooperative autonomous driving scenarios.
Analysis of discourse diversity in multi-turn empathic dialogue, examining LLM formulaicity beyond single-turn settings.
Grounded world models for visuomotor planning using pretrained vision encoders, enabling semantic generalization without explicit goal images.
StarVLA-α simplifies Vision-Language-Action models for robotic agents by studying unified design choices across architectures and training data.
Efficient KernelSHAP explainability method for patch-based 3D medical image segmentation with reduced computational cost.
Benchmark for evaluating general reasoning capabilities of LLMs across diverse challenging tasks beyond domain-specific reasoning.
Full-stack infrastructure for training, evaluating, and deploying GUI agents with online RL and unified evaluation framework.
Runtime security framework protecting tool-augmented LLM agents against indirect prompt injection attacks through tool-returned content.
Mechanistic analysis of internal dynamics in looped reasoning language models versus standard feedforward models.
Benchmark dataset for detecting AI-generated Chinese text with evaluation across multiple LLM architectures.
Deep learning method for uncertainty quantification in clinical radiotherapy segmentation using budget-aware constraints.
RL approach for training physics reasoning models on simulators to address lack of large-scale QA datasets in physics domain.
Evaluation of LLM causal reasoning capabilities using real-world complex texts with implicit causal relationships.
Benchmark evaluating VLMs' strategic reasoning abilities in multi-agent environments with multimodal observations.
Three-stage pipeline for disambiguation-centric finetuning of enterprise tool-calling LLMs to reduce errors with near-duplicate tools.
Multi-agent LLM system for automated academic poster generation from papers incorporating design and aesthetic principles.
Benchmark and framework for evaluating LLM-driven persuasive dialogue for health behavior change in insulin delivery adoption.
GUI agent framework for multi-step e-commerce risk management handling stateful interactions with dynamic web content.
Interactive learning approach enabling LLMs to improve reasoning through multi-agent interactions during inference without re-execution.
Reward learning method deriving progress estimation signals from passive videos for robotics RL tasks without manual reward engineering.
RL method for improving reasoning in diffusion-based language models using denoising process rewards instead of outcome-only rewards.
Multi-agent LLM system for iterative narrative script refinement using divide-and-conquer approach to improve long-form creative content generation.
RL framework for e-commerce search relevance using stepwise reward optimization to improve LLM-based query-product matching beyond SFT/DPO limitations.