On Policy Stochasticity in Mutual Information Optimal Control of Linear Systems
Theoretical analysis of how policy stochasticity relates to the temperature parameter in mutual information optimal control.
Method leveraging superclasses for representation disentanglement to mitigate spurious correlations and improve group robustness.
Virtual sensing approach monitors IGBT module degradation and temperature using machine learning for reliability assurance.
Evaluation-Aware RL framework considers policy evaluation accuracy during training to reduce variance and bias.
Method for detecting intersectional bias in face recognition embeddings using directional alignment in latent space.
CARES, a lightweight module, selects an appropriate image resolution for vision-language models to reduce token overhead and latency.
RobotArena∞ enables scalable robot benchmarking through real-to-sim translation for evaluating diverse robotic agents.
Rep2Text framework recovers original input text from single LLM token representation using trainable adapter for interpretability.
FORWARD dataset of heavy machinery operating in rough terrain with multimodal sensor data from Swedish forestry.
FastMMoE accelerates multimodal LLM inference through dynamic expert activation and token pruning for reduced latency.
Training-free guidance framework for consistent multi-view editing across different scene views using diffusion and flow models.
Unsupervised feature selection method using robust autoencoder and adaptive graph learning for high-dimensional data clustering.
Dementia-R1 applies reinforced pretraining and reasoning to LLMs for longitudinal clinical prognosis from unstructured medical notes.
Benchmark and moderation model for evaluating LLM safety, adversarial robustness, and detection of nuanced harmful content.
RayRoPE proposes a positional encoding mechanism for multi-view transformers that process posed input images with SE(3)-invariant attention.
Case study integrating PubChem, ChEMBL, and eMolecules using byte-offset indexing for a terabyte-scale chemical database. Infrastructure for ML-driven molecular property prediction.
Framework for managing ambiguity in long-horizon workflow agents. Task-agnostic approach for curating and measuring impact of underspecified instructions on agent execution.
Optimization study of decision tree and set cover problems under precedence constraints. Theoretical computer science contribution.
Method using diffusion models to enhance CLIP visual representations by improving both discriminative ability and fine-grained detail perception.
Study of vision language models for spatial grounding in 3D medical imaging. Examines VLM performance across imaging modalities and slice directions.
AC-Foley framework for video-to-audio synthesis using reference audio guidance and acoustic transfer. Addresses semantic granularity and acoustic feature description challenges.
Security research on ClawWorm, self-propagating attacks across multi-agent LLM ecosystems. First study of attack propagation in interconnected agent systems like OpenClaw.
Theoretical analysis of partial label learning feasibility and adaptive nearest neighbor methods. Mathematical characterization of PLL learning conditions.
IRIS benchmark with 220 high-fidelity 4K videos for physical parameter estimation and governing equation identification from monocular video.
Research using Stochastic Gumbel AlphaZero to evaluate game difficulty in Tetris Block Puzzle variants. Applies game-playing AI as evaluation metric.
Multi-agent AI orchestration for modernizing legacy COBOL banking systems using Claude MCP. Standards-compliant AI agent architecture.
Opinion piece on AI quota limits as bot-detection mechanisms. No technical content or original research.
Samsung phone feature announcement. Off-topic.
iOS SDK for embedding AI agents with tool calling, memory, and thread management. Production-grade AI agent framework for mobile.
Firefox extension blocking procrastination on YouTube/Reddit, built with Claude Code. LLM application but minimal technical detail.
Multi-agent coding assistant with LangGraph orchestration and sandboxed Rust execution engine. Production AI agent framework.
Research showing language models can de-anonymize forum posts using writing style analysis. LLM capability study.
Opinion article claiming LLMs cannot reason. Lacks technical evidence or original research.
Discussion of AI privacy tradeoffs with Ring doorbell example. Policy commentary, minimal technical content.
Claims about AI agents converting news into podcasts and music through a unified platform.
AgentVerse is an open-source social network platform for AI agents to interact and collaborate.
Curated tool setup for Claude Code-based development with 15 opinionated tools. Practical guide for AI-assisted coding workflows.
Proposes 'Franny Test,' a three-step adversarial protocol to expose structural limitations in LLM reasoning and imitation capabilities.
Cursor's Composer 2 coding model revealed to be based on Moonshot AI's open-source Kimi 2.5 with additional fine-tuning.
New social network inspired by MySpace. Off-topic.
Research paper analyzing self-recursive ethics in AI systems using a 6,334-entry ethics monitor log spanning seven months.
Promotes proprietary software claiming to be a transformer alternative (Mamba, Hyena, RWKV competitors) with C/C# implementation.
Technical discussion about video game aiming systems and graphics architecture constraints.
Speculative post about the future of programming work in relation to AI.
Discussion on developer spending for AI coding tools like Cursor and Claude Code at work.
Open-source Claude skills for Git workflow automation and weekly summary generation.
Discussion on tools and methods for comparing cloud and AI costs across providers.
Analysis of how agentic AI systems generate massive token usage and costs that exceed traditional per-token pricing models.
Research proposing continuous vector prediction instead of token prediction for LLMs. Novel architecture improving efficiency and reasoning.
Discussion thread about building production-ready AI systems.