HN bellamoon544 8d ago

Whisk AI

Google Labs Whisk AI: free image generator that blends three visual inputs (subject, scene, style) using Gemini and Imagen 3.

HN teichmann 8d ago

An AI Vibe Coding Horror Story

Case study of AI vibe coding failure: non-technical person built faulty patient management system instead of using proven solutions.

HN ywian 8d ago

Trace your Claude Code easily

Desktop and web viewer for Claude Code session logs with expandable tool calls and token tracking, built with Tauri and React.

HN zagwdt 8d ago

Introspective Diffusion Language Models

Introduces Introspective Diffusion Language Models (I-DLM) using strided decoding to improve parallel token generation quality versus autoregressive models.

Ax Nicolas Rodriguez-Alvarez (Instituto de Educacion Secundaria Parquesol, Valladolid, Spain), Fernando Rodriguez-Merino (University of Valladolid, Valladolid, Spain) 8d ago

Fairness is Not Flat: Geometric Phase Transitions Against Shortcut Learning

Geometric methodology to mitigate shortcut learning and demographic bias in deep neural networks through topological constraints.

Ax Hongli Zhan, Emma S. Gueorguieva, Javier Hernandez, Jina Suh, Desmond C. Ong, Junyi Jessy Li 8d ago

Discourse Diversity in Multi-Turn Empathic Dialogue

Analysis of discourse diversity in multi-turn empathic dialogue, examining LLM formulaicity beyond single-turn settings.

Ax Quanyi Li, Lan Feng, Haonan Zhang, Wuyang Li, Letian Wang, Alexandre Alahi, Harold Soh 8d ago

Grounded World Model for Semantically Generalizable Planning

Grounded world models for visuomotor planning using pretrained vision encoders, enabling semantic generalization without explicit goal images.

Ax Hugh Blayney, Álvaro Arroyo, Johan Obando-Ceron, Pablo Samuel Castro, Aaron Courville, Michael M. Bronstein, Xiaowen Dong 8d ago

A Mechanistic Analysis of Looped Reasoning Language Models

Mechanistic analysis of internal dynamics in looped reasoning language models versus standard feedforward models.

Ax Hehai Lin, Shilei Cao, Sudong Wang, Haotian Wu, Minzhi Li, Linyi Yang, Juepeng Zheng, Chengwei Qin 8d ago

Interactive Learning for LLM Reasoning

Interactive learning approach enabling LLMs to improve reasoning through multi-agent interactions during inference without re-execution.

Ax Weihua Cheng, Junming Liu, Yifei Sun, Botian Shi, W Yirong Chen, Ding Wang 8d ago

MGA: Memory-Driven GUI Agent for Observation-Centric Interaction

MGA, a memory-driven GUI agent, reduces context overload and architectural redundancy by managing sequential trajectory history, improving long-horizon end-to-end automation.