Agents of Chaos: Breaches of trust in autonomous LLM agents
ArXiv paper on security/trust issues in autonomous LLM agents (abstract only, content truncated).
Essay de-anthropomorphizing AI agents, framing them as search/utility tools rather than thinking entities.
Cybersecurity resource organizing vulnerable VMs by attack techniques for CTF learners.
System that reads research papers across domains to generate cross-domain hypotheses. Early stage with three discoveries published.
Overview of embodied AI systems that perceive, adapt, and act in physical environments beyond code-based systems.
Opinion piece critical of AI existential risk advocacy and doomerism narratives.
Discussion comparing local LLMs versus API-based and subscription models. Addresses whether local models can match frontier-quality AI.
Production experience thread comparing agentic search versus RAG. Community shares transition triggers, breakages, and hybrid approaches.
Next.js middleware serving clean Markdown instead of HTML to AI agents, reducing token waste from boilerplate by 2-5x for better LLM performance.
Pi: minimal terminal coding agent with TypeScript extensions, npm packages, and multiple modes (interactive, RPC, SDK, JSON output).
iOS app using video analysis to convert workout footage into structured fitness routines for users aged 35-55.
Landing page for an AI agent product that automates email, meetings, and task management, positioned for founders and managers.
Claude Code hook that reminds users to go to sleep at bedtime by injecting reminders into context and logging violations. Open source tool.
Guide on using Claude Code with Figma for AI-driven product design workflows, converting AI-generated code to editable design files.
Pragmatica Aether is a distributed Java runtime for JVM applications with clustering and auto-scaling, an alternative to Kubernetes. Open source.
Scheme-langserver provides language server protocol support for Scheme/Lisp with goto-definition, auto-completion, and type inference.
Opinion piece on AI founders caught up in the momentum of building despite doubts about the industry's direction and priorities.
Trolley bundles TUI executables with a terminal emulator runtime for distribution to non-technical users on Linux/macOS. Pre-alpha stage.
PCA-VAE replaces vector quantization with a differentiable online PCA bottleneck via Oja's rule, eliminating codebook collapse and straight-through estimators.
Trustworthy Unified Explanation framework for interpreting LLM reasoning, revealing stability and systematic failure mechanisms across instances.
Generative recommendation framework using multi-modal LLMs to mine deep multi-interests beyond shallow behavioral signals for semantic ID prediction.
Framework for calibrating AI benchmark performance against world population baselines to provide human-anchored capability scales.
Membership inference attacks on ML models using model extraction in label-only settings without access to confidence scores or shadow models.
Theoretical analysis of gradient descent convergence rates for separable logistic regression under large step sizes and unstable regimes.
Deep learning method for antibody sequence engineering using phylogenetic models to capture evolutionary dynamics in affinity maturation.
Theoretical analysis of neural network training complexity under Real-RAM vs bit-model computation, proving ERM for simple networks is ∃ℝ-complete.
Proposes Active Data Reconstruction Attack (ADRA) for detecting LLM training data through active model manipulation rather than passive membership inference.
Applies generative RL to inverse lithography for semiconductor manufacturing mask synthesis, replacing deterministic approaches with conditional sampling.
Analyzes model collapse in image generative models through iterative feedback loops using a Markovian framework, revealing neural resonance phenomena in latent space.
Addresses intransitive preferences in LLM fine-tuning via preference learning, proposing methods to handle cyclic preference conflicts in multi-objective optimization.
Inverse distillation for diffusion language models. Accelerates discrete diffusion models for faster text generation inference.
RKHS representation theory for algebraic convolutional filters using integral operators. Signal processing framework for continuous models.
Analysis of sign-based optimizers for adversarial attacks. Studies attack stability and transferability using decaying step sizes.
Dynamic sample pruning for spatio-temporal forecasting. Optimizes training data efficiency for deep learning on large datasets.
Robust Bayesian random feature regression with contaminated priors. Studies double descent phenomenon under model misspecification.
Influence functions for detecting labeling bias in datasets. Addresses fairness issues from biased data collection.
Test-time learning method for causal structure discovery from interventional data. Combines test-time training with causal inference.
Celo2 learned optimizer with improved meta-generalization. Aims for practical adoption of learned optimization rules beyond hand-designed optimizers.
Analysis of how transformers learn sparse attention patterns incrementally. Studies information integration from multiple past positions.
Virtual Parameter Sharpening for inference-time reasoning enhancement. Dynamic low-rank perturbations for transformer adaptation without persistent parameters.
Theoretical analysis of realizable online regression under metric-like losses. Studies ReLU networks in adversarial setting.
Adaptive problem generation via symbolic representations for training small open-weight LMs on math tasks. Data generation using RL with verifiable rewards.
Federated learning approach for financial crime detection handling hybrid data distributions. Privacy-preserving collaborative ML.
Dynamic rollout allocation and policy optimization for LLM reasoning with verifiable rewards. Improves RL training efficiency for reasoning tasks.
Combinatorial interpretability framework for understanding knowledge persistence in unlearning. Studies how information is retained in foundation models.
Evaluation of SAP's RPT-1 tabular foundation model on enterprise data. Compares in-context learning vs traditional ML on structured datasets.
Gradient-based optimization for explainable neuro-fuzzy systems. Addresses accuracy-explainability tradeoff in fuzzy AI models.
RL-steered graph diffusion for neural architecture search. Uses reinforcement learning to guide generative models for DAG-based NAS.
Research on spectral bias in physics-informed neural networks and KANs for solving PDEs. Analysis of learning dynamics and mitigation strategies.