TME-PSR: Time-aware, Multi-interest, and Explanation Personalization for Sequential Recommendation
Sequential recommendation model that integrates time-aware, multi-interest, and explanation personalization.
Instruction Hierarchy in LLM Agents arXiv paper addressing multi-source conflicting instructions in LLM systems. Examines privilege levels for safe instruction following.
ECHO arXiv paper on one-step diffusion model for chest X-ray report generation. Compresses multi-step denoising to single parallel generation step.
SafeAdapt arXiv paper on provably safe policy updates in deep RL for non-stationary environments. Addresses safety preservation during policy changes.
Attack method demonstrating model poisoning vulnerabilities in federated learning without requiring collusion between adversarial clients.
Post-training approach enabling LLMs to effectively retrieve and use long-context information for improved reasoning capabilities.
BERT-based evaluation method for LLM outputs that addresses limitations of rigid lexical evaluation and formatting-dependent scoring.
Agentic system for visual retrieval-augmented generation with iterative search and multi-step reasoning across visually rich documents.
Theoretical framework showing how agents with different computational capacities can develop distinct semantic alphabets for communication.
Method for predicting future scene evolution by modeling uncertainty and simulating trajectories rather than dense pixel-level changes.
Technique for decoupled confidence calibration in large vision-language models to reduce hallucinations and improve reliability.
Approach using synthetic images to improve visual perception capabilities in vision-language models for spatial reasoning tasks.
Method for robust prompt learning in vision-language models that leverages visual content to handle label noise effectively.
Framework for training models to make decisions dependent on evidence quality rather than weak supervision in evidence-grounded reasoning tasks.
Research using weight pruning to identify unified mechanisms underlying harmful content generation in aligned LLMs across different domains.
Framework enabling LLMs to learn complex game strategies through self-reflection on expert and self-generated experiences in StarCraft II.
Study evaluating LLM performance on social reasoning tasks in the Avalon game, testing inference capabilities and model distillation effects.
Framework using reinforcement learning integrated with EDA tools to optimize Verilog RTL code generation for hardware efficiency and correctness.
Research examining whether Prospect Theory accurately models LLM decision-making under linguistic uncertainty and epistemic markers.
Interactive program synthesis system enabling users to teach collaborative physical tasks through narrated demonstrations with interpretable corrections.
Chain-in-Tree framework optimizes LLM tree search by selectively branching instead of exhaustive expansion, improving efficiency for long-horizon reasoning.
Framework using anonymization to reduce identity-driven bias in multi-agent debate systems where LLM agents exchange reasoning.
AlphaCast framework combines human expertise with LLM reasoning for iterative time series forecasting with domain knowledge integration.
Adversarial wearable using thermochromic dyes to evade AI surveillance systems by creating thermal-visual misdirection.
EchoTrail-GUI framework enables GUI agents to build actionable memory from past experiences using critic-guided exploration to improve performance and generalization.
Neuro-symbolic deep reinforcement learning approach integrating background knowledge to improve sample efficiency and generalization in RL agents.
Multi-agent path replanning algorithm that efficiently handles delayed agents by precomputing solutions using temporal flexibility to avoid cascading conflicts.
Study showing large reasoning models may not report how input hints influence reasoning, with implications for interpretability and security.
ConvoLearn dataset of 2,134 dialogues for fine-tuning dialogue tutors grounded in knowledge-building theory and learning sciences.
Analysis of AI model failure modes: systematic misalignment vs. nonsensical actions across varying task complexity and model intelligence.
NLCO benchmark evaluating LLM reasoning on natural-language combinatorial optimization with hard constraints and high-dimensional search spaces.
Hospital administrative workflow simulator with FHIR integration for testing LLM-based automation in realistic multi-agent scenarios.
Benchmark evaluating LLM agents on replication of scientific papers with incomplete data, capturing real-world research challenges.
LLM distillation method that weights problems by a student-competence gradient signal-to-noise ratio for efficient training.
Framework for analyzing autonomous AI agent reasoning behavior through structured behavioral analytics beyond execution traces.
Machine unlearning method for multimodal recommendation systems using targeted reverse updates for efficient data deletion.
Production agentic system for cloud outage management with real-time updates, knowledge distillation, and conditioned action recommendations.
Domain-scoped inference architecture that treats the domain as an explicit computational parameter, enabling substrate-independent reasoning.
Memory system for deep research agents enabling efficient evolution and reasoning through intelligent trajectory memory management.
Dual-LLM framework for zero-shot human mobility trajectory synthesis from activity descriptions without historical data.
Lightweight agent benchmark with configurable evaluation metrics addressing environment overhead and task distribution imbalances.
Framework and benchmark for deep research agents using structured knowledge alongside unstructured web content for comprehensive reports.
Multi-model orchestration framework for verifier-free evolutionary inference balancing diversity and computational efficiency.
Query and evidence processing tools (Q+) to improve deep research agents with structured reasoning, reducing redundant exploration.
Multi-agent system for automated industry classification using multimodal data and geographic information without manual annotation.
RL agents using language-conditioned transfer for zero-shot generalization to new tasks via analogical semantic policies.
Data-free meta-learning from pre-trained models without original training data, analyzing robustness and failure modes.
Transfer learning framework for optimizing traffic through real-time driving advisories to human drivers in connected and automated vehicle systems.
Multi-agent reinforcement learning approach using coordination graphs to model higher-order group relationships beyond pairwise agent interactions.
Survey of methods for detecting and characterizing coordinated online behavior in social media, from community dynamics to disinformation campaigns.