1 year of LLMs writing code for me
Personal account of using AI coding agents as the primary development tool for one year. Observes recent quality threshold improvements in 2025.
Human motion capture facilities in India providing training data for humanoid robot development.
Block-Level CRDT architecture for managing distributed memory across multiple AI agents.
Microsoft executive comments on potential licensing requirements for AI agents in enterprise software.
Unslop: Browser extension filtering AI-generated content from social feeds using local LLM classification. Privacy-focused, no backend required.
Skillsmith enables writing AI coding skills once for export across multiple AI providers and platforms.
Comparison of OpenAI and Anthropic models.
Music application for browsing Apple Music library as vinyl crates.
CLI AI agent prioritizing data privacy and open-source alternatives to existing coding agents.
Developer replaced custom AI agent dashboard with Fizzy tool.
Microsoft replaces Copilot in Notepad with alternative AI writing tools on Windows 11.
Empirical benchmark study testing whether MCP (Model Context Protocol) servers improve coding agent performance on Terminal-Bench 2.0.
Formal: LLM-driven property checker backed by Lean 4. Identifies pure functions, generates properties, creates proofs using Mathlib.
Claude Code skill that aggregates developer RSS feeds and generates daily structured digests filtered by quality.
Ronja is a user-controlled optical point-to-point data link project with 1.4 km range and 10 Mbps duplex.
Darwin-27B-Opus surpasses foundation models through evolutionary FFN breeding without additional training, achieving high GPQA scores.
ReBot-DevArm is an open-source robotic arm project with full hardware and software stack for embodied AI applications.
Bangen is an ASCII banner renderer built on pyfiglet, rich, and Pillow with TUI, effects, and export capabilities.
Murmure aggregates developer sentiment from Reddit, HN, and forums into weekly intelligence reports on AI coding tools.
BlkBolt technology for AI content attestation and verification using revocable signatures for agent tracking.
Technical guide addressing iframe scrolling limitations in MCP Apps on mobile with design patterns for responsive UIs.
Rust implementation for detecting MITRE ATLAS techniques targeting LLM security and adversarial attacks.
Multica is an open-source platform turning coding agents into managed teammates with task assignment and progress tracking.
Technical analysis of the evolution of AI pentesting agents, from PentestGPT to autonomous agents like PentAGI and XBOW.
Essay on bug bounty trends in 2026. Discusses AI agent effectiveness for vulnerability discovery and program management challenges.
Apache 2.0 open standard for governing AI agent payment requests. Policy engine with 12 configurable checks for payment authorization.
Open-source tax software built and maintained by autonomous AI agents. Uses IRS publications as source, applies self-improving agent loops.
Tool for multi-LLM code review consensus. Aggregates feedback from multiple models to identify blind spots and improve code quality assessment.
Essay on LLM-based knowledge management limitations. Discusses problems with AI-generated note synthesis and cognitive organization.
Agent skill implementation for token compression. Reduces output tokens by ~47% while maintaining readability.
Security report on 1.4M AI-driven API test executions. Maps vulnerabilities to OWASP Top 10 using agentic testing.
Cloudflare expands access to OpenAI's frontier models via Agent Cloud platform, enabling enterprises to deploy AI agents for customer support, system updates, and report generation.
Benchmark evaluating humor alignment across frontier LLMs using Cards Against Humanity gameplay, analyzing model performance vs human baseline on comedic response selection.
InstrAction pretraining framework for video foundation models to improve action recognition in instructional videos by addressing static bias in temporal understanding.
Deep learning method for cardiac MRI imaging using phase-sensitive inversion recovery to reduce acquisition time and motion artifacts in late gadolinium enhancement scans.
eBandit uses eBPF and multi-armed bandit reinforcement learning in Linux kernel for adaptive video bitrate selection with improved network signal visibility.
Evaluates cultural alignment of LLMs across 14 language-culture pairs using multilingual story moral generation task and dataset.
Investigates opportunities for resource-constrained AI research using obsolete yet capable discarded models from AI production cycles.
Workshop report on designing reinforcement learning environments for autonomous cyber defense applications.
SenBen large-scale scene graph benchmark for explainable content moderation with visual grounding and sensitivity annotations.
HiFloat4 low-precision floating-point format for efficient 4-bit LLM pre-training on Ascend NPU hardware.
Dictionary-aligned concept control method for safeguarding multimodal LLMs against malicious queries at inference time.
Constraint-satisfaction-based retrieval system for matching patient profiles to clinical trials with high recall and precision.
Empirical study on how humans allocate responsibility in AI-human hybrid workflows using AI-assisted lending experiments.
AudioGuard framework for comprehensive audio safety protection including voice impersonation, speaker attributes, and compositional harms.
MedFormer-UR transformer with uncertainty quantification for safe medical image classification in clinical settings.
Survival-oriented benchmark for temporal student dropout risk modeling using Open University Learning Analytics Dataset.
Temporal survival modeling framework for predicting student dropout using LMS engagement data and administrative records.
Re-examines capacity gap in chain-of-thought distillation, finding student models often outperform teacher distillation baselines.
HTNav framework for aerial vision-and-language navigation combining visual perception with language instructions in urban environments.