Entropy-Driven Curriculum for Multi-Task Training in Human Mobility Prediction
Entropy-driven curriculum learning approach for multi-task human mobility prediction from mobile device data.
Optimal transport-enhanced graph networks for aspect-based sentiment analysis using syntactic-semantic structures.
Multi-view diffusion policy for coordinated mobile manipulation control with manipulability awareness in unstructured environments.
Robotic skill composition using scene graphs for generalist robots to solve complex tasks with distribution shift robustness.
Single-image implicit surface reconstruction for robotics obstacle avoidance and motion generation.
Surrogate-free multi-agent reinforcement learning framework using generative models instead of explicit policy populations.
Active learning method for correlation clustering in cold-start settings without initial pairwise similarity data.
Transformer architecture using cross-state transition attention for robust robotic manipulation from demonstrations.
Prompting protocol combining objection-raising and revision mechanisms to improve LLM reasoning and self-correction.
Multi-turn red-teaming approach using tree-based dialogue and reinforcement learning for discovering LLM vulnerabilities.
Scalable methods for computing Wasserstein barycenters of probability measures via gradient flows.
Hardware-software co-design framework for efficient multimodal model inference on battery-powered edge devices.
Membership inference attacks on LLM tokenizers as privacy attack surface distinct from model attacks.
Backdoor attack on vision-language-action models demonstrating action-level behavioral manipulation vulnerabilities.
World model and MPC framework for humanoid robot contact planning combining learned representations with sampling-based control.
Open-source corpus and tools for training fully open multimodal LLMs with improved data quality and reasoning.
Study on unintended reasoning behaviors in reinforcement-learning-trained LLMs and chain-of-thought monitoring.
Continual learning method for audio-visual segmentation addressing modality entanglement in sequential tasks.
Framework enabling LLMs to perform tabular prediction via structural priors and reasoning-focused optimization.
Evaluates driving world models as synthetic data generators for autonomous vehicle perception tasks.
Transformer framework for class-agnostic object counting using visual repetition patterns.
Navigation system using 3D Gaussian Splatting memory for multi-modal visual goal navigation in robotics.
SwiftEmbed: production text embedding system achieving 1.12 ms latency and 50k req/s using static token lookup in Rust.
Research on vectorized, parallelized online POMDP planning for autonomous robot decision-making under partial observability.
Research on detecting AI-generated images via diffusion model snap-back reconstruction forensics. Addresses Stable Diffusion and DALL-E detection.
Comparative study of interpretable fuzzy reasoning vs deep learning for motor-imagery EEG classification in brain-computer interfaces.
Research paper on federated learning of mixture-of-experts models for mobile edge computing and resource-constrained devices.
FATE benchmark series for formal algebra theorem proving at multiple difficulty levels. Evaluates LLM capabilities on mathematical reasoning beyond contest problems.
Detection method for AI-generated images using contextual anomaly estimation in masked autoencoders. Extends DetectGPT approach from text to vision domain.
HatePrototypes: Interpretable representations for hate speech detection covering implicit and explicit hate. Addresses content moderation with transferable embeddings.
UnfoldLDM combines deep unfolding networks with latent diffusion models for blind image restoration. Model-based interpretable approach to image processing.
Probabilistic certification framework improving SmoothLLM defense against LLM jailbreaking attacks. Addresses robustness guarantees with realistic assumptions.
Yo'City: Agentic framework using self-critic expansion for personalized, boundless 3D city generation. Demonstrates AI agent reasoning in creative generation tasks.
Transformer model with physics-inspired attention masks for neutrino reconstruction in KM3NeT/ORCA telescope. Domain-specific deep learning application.
Automated pipeline for generating multi-turn conversational jailbreak attacks against LLMs using psychological principles such as foot-in-the-door (FITD), without manual dataset creation.
Contrastive learning approach for adapting foundation models to domain-specific tasks in Earth observation without full retraining.
Protein inverse folding method combining retrieval-augmented approaches with denoising diffusion for amino acid sequence design from protein structures.
Deep learning pipeline for automated foraminifera species classification from micro-CT scans using a dataset of 27 species across 12 representative classes.
AltNet addresses plasticity loss in RL-trained neural networks via parameter reset strategies. Research on continual learning for RL agents.
arXiv paper on evaluating agentic systems via process-centric analysis of trajectories and reasoning patterns rather than outcomes alone. Foundational agent analysis framework.
Shapley value extension for nonlinear feature attribution and explainability. XAI research not specific to LLMs or agents.
Deep learning method for off-road vector extraction from geospatial data. Computer vision research unrelated to user interests.
SALVE framework for neural network interpretability and control using sparse autoencoders. Mechanistic interpretability research not focused on LLMs or agents.
LaMer: Meta-RL framework enabling LLM agents to actively explore and learn from trial-and-error in multi-turn tasks. Research on agent training methodology.
Test-time depth refinement framework combining depth estimation with diffusion models. Computer vision research unrelated to user interests.
arXiv paper analyzing cost trade-offs between reasoning and non-reasoning LLMs for Text-to-SQL tasks on cloud platforms. Empirical efficiency comparison.
DrivingGen benchmark for generative video world models in autonomous driving. Research on agent simulation and synthetic data generation.
NC-Bench: arXiv benchmark evaluating LLM conversational competence on form/structure vs content. Research paper on LLM evaluation methodology.
Audit of LAION-Aesthetics Predictor studying whose aesthetic values are embedded in visual generative AI training datasets.
POI recommendation system using hypergraph learning to capture mobility variations across contextual scenarios in location-based social networks.