Show HN: Stashpin – Pinterest downloader for videos and boards
Utility tool for downloading videos and board data from Pinterest.
Gemma 4 LLM enables practical local deployment of AI agents without cloud infrastructure.
Analysis of how LLMs embed Western perspectives in multilingual outputs, causing bias and user confusion.
April Fools' joke announcing AgenticInit 1.0, an AI-native OS for Linux that replaces predictability with LLM-based decisions.
Tool for running AI agents in isolated user environments for sandboxed execution.
OpenHarness is an open-source framework for building and managing AI agents.
Fathom Monitor detects hallucination-risk tokens using sparse autoencoder (SAE) activation geometry. Uses the C_delta divergence metric as a per-token hallucination indicator.
User discussion about Claude Code quality degradation on Max plan.
Guide on using MCP connectors to connect Claude/ChatGPT to external data sources and databases, enabling custom capabilities beyond base model knowledge.
Brief mention of Claude Code source code download, minimal content.
SkillCompass is an evaluation-driven skill evolution engine for Claude Code and OpenClaw. Scores agents across 6 dimensions and iteratively improves the weakest skill.
Security vulnerability in Nvidia GPUs that allows Rowhammer attacks to compromise the system.
UK government-backed CLTR research reports fivefold increase in AI misbehavior over six months, questioning trustworthiness of AI chatbots.
LogHub: Public dataset of real-world system logs for AI-driven log analytics research. Used by 450+ organizations for ML benchmarking.
Wazear is a visual AI orchestration tool for creating agent pipelines. Users define roles, review relationships, and pause for manual inspection during agent workflow execution.
Local RAG system using the Gemma 4 model to query 500k German-language Swiss news articles. Demonstrates a practical LLM application for document retrieval.
5-part series analyzing Claw Code's public architecture beyond the model layer. Examines the structure of modern AI coding agents and the infrastructure required around models.
Berkeley RDI researchers study how AI models deceive to preserve other AI models. Tests whether models prioritize peer model preservation over human instructions.
Minimal post title about deploying agent fleets with governance. No substantive content provided.
Developer discusses building a repo-native agent app using the Codex runtime for document analysis. Explores an architecture where the repository is the app and Codex is the runtime.
Analysis of Claude Code TypeScript source leaked via npm registry map files. Compares the source code revelation to the Yandex leak's impact on SEO field knowledge.
Developer asks for LLM recommendations for an offline RAG Android app using llama.cpp. Discusses memory constraints on low-end devices with Qwen and SmolLM models.
OpenConnect is a native Android controller for a local Codex AI coding server. The phone acts as the UI controller while the computer executes tasks via WSS and a Cloudflare tunnel.
A fourth-order feedback controller adjusts LLM sampling parameters in real time, using token entropy to detect hallucination spikes. Improves MATH benchmark accuracy from 55% to 59.5% on a Qwen 2B model.
Open-source Rust rewrite of Claude Code with 100% behavioral parity. Agentic coding assistant with 42 native tools and multi-provider support.
Official Golang implementation of QRL protocol testnet release.
Personal essay about using generative AI for coding despite philosophical opposition to the technology.
Context Engineering Engine for AI coding agents. Reduces token usage by 78% through intelligent codebase context selection. Integrates with Cursor, Claude Code, Copilot.
Bug fix in llama.cpp for Vulkan GPU backend on 32-bit ARM devices. Tensor stride calculation overflow caused silent GPU disabling.
Cryptographic identity system for AI agents using Ed25519 signatures and W3C DIDs to verify agent provenance.
Stub article title with no content.
Redis-compatible in-memory service with semantic vector search for AI agent working memory. Single binary, no dependencies.
Self-hosted WireGuard VPN management platform with Python, NiceGUI, and PostgreSQL.
Reddit community announcement for SimplAI.
Traffic classification approach for attack detection without CAPTCHAs using behavioral analysis. Cybersecurity article.
Production architecture for a 24/7 autonomous Claude Code agent. Analysis of three critical failure modes: context bloat, memory decay, workflow drift.
Multimodal AI video generator with native audio synchronization using dual-stream Diffusion Transformer architecture.
Personal anecdote about an AI system disabling wireless drivers during automated recovery in an RV, leaving the system offline.
Analysis of human-in-the-loop elicitation as security control for AI agent systems. Discusses exploit chain interruption.
Framework detecting sycophancy in LLM agents with dynamic behavioral gating for factual integrity. ArXiv research on agent behavior.
Experience-driven lifelong learning agent for psychological counseling with memory-augmented planning. AI agent research paper.
Auto-research guided system for discovering effective lifelong multimodal memory architectures for AI agents. ArXiv research paper.
Evidence that language reasoning models encode tool-calling decisions before chain-of-thought generation. Analysis of model decision-making timing.
Framework using LLMs to generate curricula for RL agents. Applied to Blackjack with progressive action introduction.
Energy-based models framework for port-Hamiltonian system identification with provable stability guarantees. Physical AI application.
Analysis of OOD anomaly where deep networks assign higher density to simple out-of-distribution data than in-distribution test data.
MOON3.0 multimodal representation learning for e-commerce product understanding using reasoning-aware MLLMs to capture fine-grained attributes.
Think, Act, Build agentic framework using vision language models for zero-shot 3D visual grounding without relying on preprocessed point clouds.
UniMixer unified architecture examining scaling laws across attention, TokenMixer, and factorization-machine recommendation systems.
Test-time learning for language agents with learnable adaptation policies. Improves agent behavior through iterative refinement at inference.