Isolater - Feed

HN JB_5000 2/26/2026

Show HN: ContextUI open sourced – Local first AI workflows for humans and agents

Analysis piece on bottlenecks in exponential AI output growth. Limited technical depth.

HN rsecora 2/26/2026

Programming Without People: Designing a Language for LLMs

Research on designing programming languages optimized for LLM code generation and interaction.

HN jtalk22 2/26/2026

Show HN: Slack MCP Server v2.0.0 (deterministic Slack MCP diagnostics)

Slack MCP Server v2.0.0 with deterministic diagnostics and stable tool contracts. AI agent infrastructure for Slack integration.

HN christalingx 2/26/2026

Show HN: Compression API for LLM prompts (40-60% token savings, ~5ms overhead)

API for compressing LLM prompts achieving 40-60% token savings with minimal overhead.

HN sebg 2/26/2026

Frontier Model Training Methodologies

Article title about frontier model training methodologies. No content for evaluation.

HN Johnene 2/26/2026

I fine-tuned a 14B model to beat GPT-4o at NYT Connections (30% vs. 22.7%)

Fine-tuned 14B model achieving 30% accuracy on NYT Connections puzzle vs GPT-4o's 22.7%. Original ML benchmark result.

HN geox 2/26/2026

AI and the Dream: Technology in the Service of Humanity

LM Studio tool for loading and using local LLMs remotely with end-to-end encryption.

HN zhebrak 2/26/2026

Physics-based simulator for distributed LLM training and inference

Physics-based simulator for distributed LLM training and inference optimization.

HN antonly 2/26/2026

Tell HN: Silent Netcup Domain Registrar DNSSEC Failure

Netcup domain registrar DNSSEC infrastructure failure causing DS record mismatches. Technical but unrelated to AI/ML interests.

LB vgel.me via atharva 2/26/2026

Small Models Can Introspect, Too

Open-source 32B model demonstrates introspection capabilities through logit analysis. Improved prompting enhances performance on detecting injected concepts in activations.

BL 2/26/2026

OpenAI Codex and Figma launch seamless code-to-design experience

OpenAI Codex and Figma integration using MCP standard to enable code-to-design bidirectional conversion. Uses AI agents to interface with external systems.

Ax Georgios Kamaras, Subramanian Ramamoorthy 2/26/2026

A Distributional Treatment of Real2Sim2Real for Object-Centric Agent Adaptation in Vision-Driven Deformable Linear Object Manipulation

Real2Sim2Real framework for deformable linear object manipulation using likelihood-free inference and visual perception for robotic agents.

Ax Kangyu Zheng, Tianfan Fu, Zhiding Liang 2/26/2026

QCS-ADME: Quantum Circuit Search for Drug Property Prediction with Imbalanced Data and Regression Adaptation

Quantum machine learning framework for predicting ADME drug properties using quantum circuit search with imbalanced data handling.

Ax Thomas Kwa, Ben West, Joel Becker, Amy Deng, Katharyn Garcia, Max Hasin, Sami Jawhar, Megan Kinniment, Nate Rush, Sydney Von Arx, Ryan Bloom, Thomas Broadley, Haoxing Du, Brian Goodrich, Nikola Jurkovic, Luke Harold Miles, Seraphina Nix, Tao Lin, Neev Parikh, David Rein, Lucas Jun Koba Sato, Hjalmar Wijk, Daniel M. Ziegler, Elizabeth Barnes, Lawrence Chan 2/26/2026

Measuring AI Ability to Complete Long Software Tasks

Proposes metric measuring AI ability to complete long software tasks by comparing model performance to human domain expert completion time.

Ax Shivasankari Kannan, Yeounoh Chung, Amita Gondi, Tristan Swadell, Fatma Ozcan 2/26/2026

High-Fidelity And Complex Test Data Generation For Google SQL Code Generation Services

Test data generation method for SQL code generation services using high-fidelity synthetic data to model complex data structures and semantic relationships.

Ax Anton Selitskiy, Maitreya Kocharekar 2/26/2026

Discrete Optimal Transport and Voice Conversion

kDOT: discrete optimal transport framework for voice conversion using barycentric projection in pretrained speech embedding space instead of averaging strategies.

Ax Junxiao Yang, Jinzhe Tu, Haoran Liu, Xiaoce Wang, Chujie Zheng, Zhexin Zhang, Shiyao Cui, Caishun Chen, Tiantian He, Hongning Wang, Yew-Soon Ong, Minlie Huang 2/26/2026

BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs

BARREL identifies pathological reasoning patterns in Large Reasoning Models and improves factual reliability. Enables models to admit ignorance instead of giving confident false answers.

Ax Guodong Du, Zhuo Li, Xuanning Zhou, Junlin Li, Zesheng Shi, Wanyu Lin, Ho-Kin Tang, Xiucheng Li, Fangming Liu, Wenya Wang, Min Zhang, Jing Li 2/26/2026

Knowledge Fusion of Large Language Models Via Modular SkillPacks

Knowledge fusion method for LLMs via modular SkillPacks. Enables efficient cross-capability transfer for multi-task integration, compression, and continual learning.

Ax H. L. Dao 2/26/2026

Hyperbolic recurrent neural network as the first type of non-Euclidean neural quantum state ansatz

First non-Euclidean neural quantum state ansatz using hyperbolic GRU for Variational Monte Carlo approximation of quantum many-body ground states.

Ax Ba-Hien Tran, Van Minh Nguyen 2/26/2026

Highly Efficient and Effective LLMs with Multi-Boolean Architectures

Novel multi-Boolean architecture for weight-binarized LLMs. Framework with multi-kernel Boolean parameters reduces complexity without severe post-training performance loss.

Ax Rulin Shao, Shuyue Stella Li, Rui Xin, Scott Geng, Yiping Wang, Sewoong Oh, Simon Shaolei Du, Nathan Lambert, Sewon Min, Ranjay Krishna, Yulia Tsvetkov, Hannaneh Hajishirzi, Pang Wei Koh, Luke Zettlemoyer 2/26/2026

Spurious Rewards: Rethinking Training Signals in RLVR

Shows RLVR with GRPO can improve LLM mathematical reasoning using spurious rewards with little or no correlation to correct answers, challenging reward signal assumptions.

Ax Aloni Cohen 2/26/2026

Blameless Users in a Clean Room: Defining Copyright Protection for Generative Models

Establishes foundations for provable copyright protection in generative models. Revisits near access-freeness and defines conditions for copyright guarantee.

Ax Zhijiang Tang, Jiaxin Qi, Yuhua Zheng, Jianqiang Huang 2/26/2026

A Comprehensive Benchmark for Electrocardiogram Time-Series

Comprehensive benchmark for ECG time-series data addressing unique characteristics and specialized downstream applications of bioelectrical signals.

Ax Shan Jiang, Pranoy Kovuri, David Tao, Zhixun Tan 2/26/2026

CASCADE: LLM-Powered JavaScript Deobfuscator at Google

CASCADE: hybrid LLM-powered JavaScript deobfuscator at Google combining Gemini coding capabilities with compiler IR transformations for code comprehension.

Ax Zeyu Tang, Alex John London, Atoosa Kasirzadeh, Sarah Stewart de Ramirez, Peter Spirtes, Kun Zhang, Sanmi Koyejo 2/26/2026

Position: Beyond Sensitive Attributes, ML Fairness Should Quantify Structural Injustice via Social Determinants

Position paper arguing ML fairness research should quantify structural injustice via social determinants rather than focusing only on sensitive attributes.

Ax Seyedali Mohammadi, Bhaskara Hanuma Vedula, Hemank Lamba, Edward Raff, Ponnurangam Kumaraguru, Francis Ferraro, Manas Gaur 2/26/2026

Do LLMs Adhere to Label Definitions? Examining Their Receptivity to External Label Definitions

Controlled experiments examining whether LLMs incorporate external label definitions or rely on parametric knowledge. Tests expert-curated, LLM-generated, and perturbed definitions.

Ax Patrick Wienholt, Christiane Kuhl, Jakob Nikolas Kather, Sven Nebelung, Daniel Truhn 2/26/2026

MedicalPatchNet: A Patch-Based Self-Explainable AI Architecture for Chest X-ray Classification

MedicalPatchNet: self-explainable architecture for chest X-ray classification using patch-based independent classification and aggregation for transparency.

Ax Rodrigo M. Carrillo-Larco, Jesus Lovón Melgarejo, Manuel Castillo-Cara, Gusseppe Bravo-Rocca 2/26/2026

PeruMedQA: Benchmarking Large Language Models (LLMs) on Peruvian Medical Exams -- Dataset Construction and Evaluation

PeruMedQA benchmark dataset of Peruvian medical exam questions in Spanish. Evaluates LLM performance on non-English, Latin American medical domain tasks.

Ax Lauri Suomela, Sasanka Kuruppu Arachchige, German F. Torres, Harry Edelman, Joni-Kristian Kämäräinen 2/26/2026

Synthetic vs. Real Training Data for Visual Navigation

Investigates sim-to-real gap in visual navigation by comparing simulator-trained and real-world-trained policies. Demonstrates simulator policies can match real-world performance.

Ax Kartik Hegde, Rehana Mahfuz, Yinyi Guo, Erik Visser 2/26/2026

Aligning Audio Captions with Human Preferences

Preference-aligned audio captioning framework using RLHF with CLAP-based reward model trained on human-labeled preferences. Addresses gap between supervised learning and real preferences.

Ax Advik Raj Basani, Pin-Yu Chen 2/26/2026

Diversity Boosts AI-Generated Text Detection

DivEye detector for AI-generated text using diversity metrics. Improves detection of synthetic text while providing interpretability over black-box classifiers.

Ax Shai Zucker, Xiong Wang, Fei Lu, Inbar Seroussi 2/26/2026

Minimax Rates for Learning Pairwise Interactions in Attention-Style Models

Theoretical analysis proving minimax convergence rates for learning pairwise interactions in single-layer attention models. Rate independent of embedding dimension and token count.

Ax Chengshu Li, Mengdi Xu, Arpit Bahety, Hang Yin, Yunfan Jiang, Huang Huang, Josiah Wong, Sujay Garlanka, Cem Gokmen, Ruohan Zhang, Weiyu Liu, Jiajun Wu, Roberto Martín-Martín, Li Fei-Fei 2/26/2026

MoMaGen: Generating Demonstrations under Soft and Hard Constraints for Multi-Step Bimanual Mobile Manipulation

Automated data generation framework for multi-step bimanual mobile manipulation tasks. Uses imitation learning to reduce costly human teleoperation data collection.

Ax Raheem Karim Hashmani, Garrett W. Merz, Helen Qu, Mariel Pettee, Kyle Cranmer 2/26/2026

Multimodal Datasets with Controllable Mutual Information

Framework for generating multimodal datasets with controllable mutual information between modalities. Enables systematic study of MI estimators and multimodal self-supervised learning.

Ax Shayne Longpre, Sneha Kudugunta, Niklas Muennighoff, I-Hung Hsu, Isaac Caswell, Alex Pentland, Sercan Arik, Chen-Yu Lee, Sayna Ebrahimi 2/26/2026

ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality

Largest multilingual scaling laws study with 774 experiments across 400+ languages. Introduces Adaptive Transfer Scaling Law (ATLAS) for monolingual and multilingual pretraining.

Ax Zhimin Chen, Chenyu Zhao, Ka Chun Mo, Yunjiang Jiang, Jane H. Lee, Khushhall Chandra Mahajan, Ning Jiang, Kai Ren, Jinhui Li, Wen-Yun Yang 2/26/2026

Massive Memorization with Hundreds of Trillions of Parameters for Sequential Transducer Generative Recommenders

Sequential transducer model for recommendation systems handling ultra-long user histories. Explores memorization with transformer-like architecture at scale.

Ax Georgios Kamaras, Craig Innes, Subramanian Ramamoorthy 2/26/2026

Heuristic Adaptation of Potentially Misspecified Domain Support for Likelihood-Free Inference in Stochastic Dynamical Systems

Likelihood-free inference approach adapting domain support for stochastic systems. Addresses misspecified support in robotics agent deployment scenarios.

Ax NVIDIA: Arslan Ali, Junjie Bai, Maciej Bala, Yogesh Balaji, Aaron Blakeman, Tiffany Cai, Jiaxin Cao, Tianshi Cao, Elizabeth Cha, Yu-Wei Chao, Prithvijit Chattopadhyay, Mike Chen, Yongxin Chen, Yu Chen, Shuai Cheng, Yin Cui, Jenna Diamond, Yifan Ding, Jiaojiao Fan, Linxi Fan, Liang Feng, Francesco Ferroni, Sanja Fidler, Xiao Fu, Ruiyuan Gao, Yunhao Ge, Jinwei Gu, Aryaman Gupta, Siddharth Gururani, Imad El Hanafi, Ali Hassani, Zekun Hao, Jacob Huffman, Joel Jang, Pooya Jannaty, Jan Kautz, Grace Lam, Xuan Li, Zhaoshuo Li, Maosheng Liao, Chen-Hsuan Lin, Tsung-Yi Lin, Yen-Chen Lin, Huan Ling, Ming-Yu Liu, Xian Liu, Yifan Lu, Alice Luo, Qianli Ma, Hanzi Mao, Kaichun Mo, Seungjun Nah, Yashraj Narang, Abhijeet Panaskar, Lindsey Pavao, Trung Pham, Morteza Ramezanali, Fitsum Reda, Scott Reed, Xuanchi Ren, Haonan Shao, Yue Shen, Stella Shi, Shuran Song, Bartosz Stefaniak, Shangkun Sun, Shitao Tang, Sameena Tasmeen, Lyne Tchapmi, Wei-Cheng Tseng, Jibin Varghese, Andrew Z. Wang, Hao Wang, Haoxiang Wang, Heng Wang, Ting-Chun Wang, Fangyin Wei, Jiashu Xu, Dinghao Yang, Xiaodong Yang, Haotian Ye, Seonghyeon Ye, Xiaohui Zeng, Jing Zhang, Qinsheng Zhang, Kaiwen Zheng, Andrew Zhu, Yuke Zhu 2/26/2026