Analysis of gender dynamics and homophily patterns in Chirper.ai, a social network of 70K+ autonomous LLM agents generating 140M posts, examining how AI agent identity develops in networks.
Theoretical study of expand-and-sparsify sparse representations for density and mode estimation, analyzing biological sensory system models with random projections and sparsification.
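The expand-and-sparsify operation referenced above is a standard model of biological sensory coding: project the input to a much higher dimension with a fixed random matrix, then keep only the top-k responses (winner-take-all) and zero the rest. A minimal sketch, with all dimensions and names illustrative rather than taken from the paper:

```python
import numpy as np

def expand_and_sparsify(x, proj, k):
    """Random expansion to a higher dimension, then winner-take-all
    sparsification: keep the k largest responses, zero everything else."""
    y = proj @ x                      # dense random projection, m >> d
    out = np.zeros_like(y)
    top = np.argsort(y)[-k:]          # indices of the k largest responses
    out[top] = y[top]
    return out

rng = np.random.default_rng(0)
d, m, k = 16, 256, 8                  # input dim, expanded dim, active units
proj = rng.standard_normal((m, d))    # fixed random projection
x = rng.standard_normal(d)
z = expand_and_sparsify(x, proj, k)
print(np.count_nonzero(z))            # exactly k units remain active
```

Similar inputs tend to activate overlapping sets of units, which is what makes the sparse code usable for downstream density and mode estimation.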
Krause Attention proposes a new transformer attention mechanism addressing representation collapse and attention sink phenomena through bounded normalization inspired by Krause dynamics.
SF-RAG improves retrieval-augmented generation for academic QA by preserving hierarchical document structure instead of flattening papers into chunks, enabling better evidence allocation under token constraints.
Deep reinforcement learning stability improvement using isotropic Gaussian representations to handle non-stationary training dynamics.
Parameter-efficient fine-tuning method using manifold expansion to overcome linear limitations of LoRA in complex reasoning tasks.
Analysis of transformer training dynamics under AdamW optimizer identifying low-dimensional stable drift patterns in parameter evolution.
Cognitive psychology-based study showing LLMs exhibit proactive interference dominance, with early information overriding recent conflicting context.
Benchmark evaluating whether code agents can understand multi-file software architecture through codebase exploration under partial observability.
Analysis of LLM internal representations showing increased sparsity with task difficulty and out-of-distribution shift across contexts.
Domain-specific enhancement of vision-language models for ophthalmic diagnosis by injecting expert knowledge to address perception and reasoning gaps.
Reinforcement learning robustness method using adversarial latent-state training for partially observable environments.
Theoretical analysis connecting drifting models and score-based models through kernel-induced mean-shift discrepancy.
Task and motion planning approach combining scheduling with incremental learning for warehouse automation under resource and motion constraints.
Large-scale distributed training infrastructure for embodied AI using thousands of GPUs and the LeRobot framework with optimization recipes.
Security vulnerability analysis of LLM multi-agent systems showing inference attacks can extract communication topology without administrative access.
Parameter-efficient fine-tuning method using representation finetuning for continual learning on pre-trained models with explicit optimization dynamics.
Incremental learning framework using vision-language models with multi-adapter fine-tuning to improve efficiency and reduce memory requirements.
Study on decoding emotional affect from surface EMG during speech production using machine learning.
Analysis of safety drift in tool-augmented LLM agents, showing ranking metrics miss unsafe recommendations in high-stakes financial advisor scenarios.
Surgical duration prediction using retrieval-augmented LLMs and Bayesian averaging without fine-tuning, applied to hospital resource management.
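As a generic illustration of the retrieval-plus-averaging idea (not the paper's exact estimator), one can retrieve historical cases similar to the query and form a similarity-weighted average of their durations, with a softmax over negative distances standing in for posterior weights. All names and the weighting scheme here are illustrative assumptions:

```python
import numpy as np

def averaged_duration(query_feat, case_feats, case_durations, tau=1.0):
    """Similarity-weighted average over retrieved historical cases.
    exp(-distance / tau) acts as an unnormalized posterior weight."""
    d = np.linalg.norm(case_feats - query_feat, axis=1)
    w = np.exp(-d / tau)
    w /= w.sum()
    return float(w @ case_durations)

# Toy historical cases: feature vector -> observed duration in minutes.
cases = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
durs = np.array([60.0, 90.0, 240.0])
pred = averaged_duration(np.array([0.1, 0.0]), cases, durs, tau=0.5)
print(pred)   # dominated by the nearest case's 60-minute duration
```

Because no parameters are learned, the approach needs no fine-tuning; prediction quality rests entirely on the retrieval features and the weighting temperature.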
Study on improving LLM code generation with private libraries, showing retrieval-based API documentation injection is insufficient for effective library usage.
Spectral Edge Dynamics quantifies transformer training trajectory structure through rolling SVD, identifying boundary between optimization directions and noise.
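The rolling-SVD idea can be sketched generically (this is an illustrative toy, not the paper's Spectral Edge Dynamics algorithm): stack the most recent parameter updates into a matrix, take its SVD, and look for a gap in the singular-value spectrum separating a few dominant optimization directions from the noise floor.

```python
import numpy as np

def rolling_update_spectrum(param_snapshots, window):
    """SVD of the last `window` parameter updates. Rows are flattened
    update vectors; the singular values show how many directions carry
    most of the trajectory's variance."""
    deltas = np.diff(param_snapshots, axis=0)[-window:]
    _, s, _ = np.linalg.svd(deltas, full_matrices=False)
    return s

rng = np.random.default_rng(1)
dim, steps = 100, 40
drift = rng.standard_normal(dim)          # one stable drift direction
snaps = np.cumsum(
    0.5 * drift + 0.05 * rng.standard_normal((steps, dim)), axis=0)
s = rolling_update_spectrum(snaps, window=20)
print(s[0] / s[1])   # large ratio: a single direction dominates the updates
```

In this synthetic trajectory one drift direction dominates, so the leading singular value towers over the rest; a real training run would show a small cluster of dominant values above a noise edge.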
LICA dataset of 1.55M layered graphic design compositions with hierarchical metadata for layout understanding and generation.
Analysis of multimodal LLM-generated natural language explanations for face verification on unconstrained images using IJB-S dataset.
Survey of deployment constraints and mitigation strategies for foundation models in resource-constrained embodied edge systems.
HopChain improves vision-language reasoning through multi-hop data synthesis to address perception, reasoning, and hallucination errors in VLMs.
SCALE addresses bottlenecks in virtual cell perturbation prediction using foundation models for in silico experimentation.
Multimodal multilingual benchmark with 3000 texts and 6000 images for detecting harmful humor across English and Arabic.
TDAD is an open-source tool performing impact analysis for AI coding agents to detect and prevent regressions in test-driven agentic development.
Geometric analysis of Rotary Positional Embedding performance breakdown on long inputs, explaining channel rotation distribution shift.
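For context, standard RoPE rotates each 2-D channel pair at position `pos` by the angle `pos * base**(-2i/d)`. The sketch below shows why long inputs shift the per-channel angle distribution: high-frequency pairs wrap around many full turns while low-frequency pairs barely move. This illustrates the standard RoPE formula only, not the paper's specific analysis.

```python
import numpy as np

def rope_angles(pos, dim, base=10000.0):
    """Rotation angle applied to each 2-D channel pair at position `pos`
    under standard Rotary Positional Embedding."""
    i = np.arange(dim // 2)
    freqs = base ** (-2.0 * i / dim)   # per-pair rotation frequency
    return pos * freqs

dim = 64
short, long = rope_angles(8, dim), rope_angles(8192, dim)
# At pos=8192 the fastest channel has completed over a thousand full
# rotations, while the slowest has not finished a single turn, so the
# effective (mod 2*pi) angle distribution differs sharply from short inputs.
wraps = np.floor(long / (2 * np.pi))
print(int(wraps[0]), int(wraps[-1]))
```

That widening spread between fast and slow channels at large positions is the distribution shift the geometric analysis attributes the long-input breakdown to.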
Architectural approach using per-layer supervision to expose hidden modularity in Transformers, enabling interpretability and causal control of components.
Methods for distinguishing system failures from domain shifts in industrial data streams using anomaly detection techniques.
Graph-regularized Koopman mean-field game framework for controlling high-dimensional neural dynamics during epileptic seizures.
Systematic methodology for fine-tuning domain-specific Japanese small language models, identifying optimal training scale (4k samples), base models, and quantization strategies.
Mathematical framework for comparing multi-agent swarm configurations using quotient geometry and persistence-stable metrics.
Zero-knowledge proof system enabling cryptographic verification that proprietary LLM API outputs come from claimed models, preventing model substitution or degradation.
Investigates how mechanistic interpretability features survive extreme neural network sparsification using adaptive sparsity scheduling in VAE-SAE architectures.
Lightweight adaptation method for LLM-based technical service agents using latent logic augmentation and noise reduction without full retraining.
Variational Phasor Circuit architecture for brain-computer interface classification using phase-native learnable parameters inspired by quantum circuits.
Step-level experience augmented reinforcement learning for multi-turn LLM agents that dynamically retrieve and refine experiences throughout episodes.
Meta-BayFL framework for federated learning with probabilistic approaches to handle data uncertainty and heterogeneity while managing computational overhead.
Proposes dynamic constraints for reinforcement learning fine-tuning that adapt to model capabilities, resolving tension between optimization and constraint satisfaction.
Neuro-symbolic framework combining neural operators with economic constraints for interpretable quantitative finance models respecting no-arbitrage principles.
Tula optimizes distributed large-batch training by balancing communication overhead, computation cost, and model generalization across scaling configurations.
Proposes VC-Soup method for aligning LLMs with multiple potentially conflicting human values through value-consistency guided optimization.
LLM-augmented computational phenotyping framework for discovering clinical subphenotypes in Long COVID through iterative hypothesis generation and evidence extraction.
Framework for detecting conflicts in policy languages that use probabilistic ML predicates, applied to semantic router DSL for LLM routing systems.
Improves PDE surrogate model training through gradient-informed temporal sampling strategies that optimize rollout accuracy under fixed data budgets.
Proposes AGRI-Fidelity framework to evaluate reliability of explainable AI for poultry disease detection in noisy farm environments.