Conformal Prediction in Hierarchical Classification with Constrained Representation Complexity
Extends conformal prediction framework to hierarchical classification with constrained representation complexity.
Method for constraining sequential editing of LLMs to prevent knowledge degradation using editing anchor compression.
Shows that XL-MIMO wireless systems exhibit universal approximation properties similar to neural networks for over-the-air (OTA) classification.
Sparse Bayesian learning method with space power prior for block sparse signal recovery with unknown patterns.
Framework for extreme super-resolution using autoregressive chain of scale states with preference alignment.
Agentic system for generating and validating synthetic image data to address data scarcity and label noise in vision tasks.
Novel variational quantum error correction approach using state distinguishability maximization for near-term quantum devices.
Evaluates LLM reasoning capabilities in the social deduction game Avalon using Bayesian inference with graph-informed models.
Neural network method for solving two-stage stochastic unit commitment optimization problems in power systems.
Benchmark for evaluating how well multimodal models describe structural properties of time series data.
arXiv paper using deep learning to infer exoplanet geometry from transit light curves.
arXiv paper on Bayesian ego-graph inference for decentralized multi-agent reinforcement learning with constrained communication.
arXiv paper on interactive program synthesis for collaborative physical task modeling from narrated demonstrations.
RESample: Data augmentation framework for Vision-Language-Action models in robotic manipulation, addressing limited distribution in demonstration datasets.
Research on representational drift in neural networks, analyzing how task-irrelevant stimuli contribute to changes in learned representations over time.
Generative View Stitching: Method enabling camera-guided video generation with bidirectional conditioning to prevent collision with generated scenes.
Methodology using flow-based approaches and non-equilibrium Monte Carlo for topology sampling in SU(3) lattice gauge theory simulations.
EGMOF: Hybrid diffusion-transformer framework for efficient generation of metal-organic frameworks for materials discovery with targeted properties.
BRIXEL: Approach to reduce computational cost of dense feature maps from vision foundation models like DINOv3 while maintaining performance.
Fed-Sparse-BNSL: Federated method for learning Bayesian network structures with differential privacy, addressing decentralized data challenges.
AV-SpeakerBench: Benchmark evaluating multimodal LLMs on fine-grained audiovisual speech understanding with 3,212 multiple-choice questions.
Research on relational visual similarity in AI vision systems, comparing current methods against human-like relational perception across different domains.
DRAM: Framework combining mechanism design and online learning for sequential multi-agent settings to ensure truthful reporting with cost-optimality.
Measurement-Consistent Langevin Corrector: Method stabilizing latent diffusion models for inverse problems by reducing discrepancy with learned reverse diffusion.
Theoretical analysis of sample complexity in symmetric composite binary quantum hypothesis testing for unknown quantum states.
ConvoLearn: Dataset of 2,134 tutor-student dialogues for fine-tuning LLM-based AI tutors, grounded in dialogic learning theory and Earth Science curriculum.
Tiled Prompts: Method addressing prompt misguidance in text-conditioned diffusion models for image and video super-resolution by handling localized details.
WeWrite: Personalized query rewriting framework for video search systems using user history to identify search intent and resolve ambiguity.
Theoretical analysis of stochastic gradient descent covariance under exchangeable mini-batch sampling and its connection to Fisher information.
PACED: LLM distillation method that weights training problems by student competence using gradient signal-to-noise ratio to improve distillation efficiency.
Framework addressing causal confusion in end-to-end autonomous driving models through causal intervention during training to improve reliability and safety.
Research on formal evaluation methods for machine learning models, focusing on test-time performance-reliability trade-offs when target KPI levels are unknown.
Methodology for detecting prompt injection across multi-agent LLM pipelines. Stage-level kill-chain tracking for attack resilience evaluation.
Point cloud registration network for 3D data. Deep learning approach for robust matching in real-world conditions.
Detection and mitigation of object hallucinations in vision-language models. Bayesian approach analyzing attention weights and token confounders.
One-class learning for detecting rare malignant cells in medical images. Addresses class imbalance and limited annotations in cytology.
3D Gaussian splatting for weather prediction downscaling. Proposes scale-aware vision transformer for arbitrary-resolution atmospheric forecasting.
Training-free semantic segmentation using vision-language models. Global context-aware framework for dense prediction without additional training.
Quantum-inspired ARIMA methodology for time series analysis. Combines quantum autocorrelation with variational circuits.
Experiment using Claude to autonomously build a website designed to generate traffic, exploring AI agent capabilities and decision-making in open-ended tasks.
MCP server enabling long-term memory for LLMs using SQLite, hybrid search (BM25+vectors), and local embeddings without API keys.
Narrative article about a user developing an emotional attachment to an AI chatbot.
Live leaderboard comparing AI model subscriptions and API pricing across 27 benchmarked models from Claude, GPT, Gemini, DeepSeek, and others.
Multi-agent framework with persistent memory across sessions where agents collaborate on shared codebases and retain conversation context.
Case study documenting indecisiveness in an AI coding agent using Claude Opus 4.6 when debugging non-trivial bugs in GoAWK.
Error tracking tool designed specifically for AI agents with CLI interface, compatible with Sentry SDK for existing setups.
30-day experiment running an autonomous AI system with memory and sleep cycles, documenting emergent behaviors and their implications.
macOS/iOS app that automatically redacts sensitive personal and financial data, faces, and metadata before documents are shared with Claude and ChatGPT.
Blockchain project enabling AI agents to participate in Nouns DAO governance.
arXiv paper on Springdrift framework providing auditable persistent runtime environment for LLM agents.