RedTopic: Toward Topic-Diverse Red Teaming of Large Language Models
RedTopic is a framework for topic-diverse red teaming of LLMs that adaptively identifies vulnerabilities across a broad range of harmful topics.
MS-DGCNN++ applies multi-scale dynamic graph convolution with scale-dependent normalization for LiDAR tree species classification.
Privacy-preserving graph structure learning framework for publishing open graph data with differential privacy guarantees.
Derives generalized Koopman operator solutions and nonlinear fundamental lemma for data-driven control of nonlinear systems.
Reasoning-guided LLM function completion using context when docstrings are absent in real-world code repositories.
Proposes geometric-structural dual-guided framework for medical image segmentation robust to noisy labels.
Analyzes fine-tuning image editing models versus text-to-image generators as foundations for dense geometry prediction tasks.
DreamAudio enables fine-grained control over acoustic characteristics in text-to-audio generation using diffusion models.
Applies classifier-free guidance from image generation to zero-shot text-to-speech synthesis for balancing speaker fidelity and text adherence.
MARS is an efficient multi-agent collaboration framework for LLM reasoning that reduces the computational overhead of Multi-Agent Debate while maintaining reasoning capabilities.
VL-KnG constructs spatiotemporal knowledge graphs from egocentric video using vision-language models for persistent scene understanding without 3D reconstruction.
Self-correction Loop with Structured Output framework enhances GPT-based VLMs for generating reliable dental radiological findings in medical image interpretation.
Counterfactual identification framework using dynamic optimal transport addresses causal inference from observational data with high-dimensional multivariate outcomes.
Study investigates transliteration methods for bridging multilingual NLP gaps, examining shared scripts, vocabularies, and phonology in non-Latin languages.
Information Gain-based Policy Optimization uses RL to train LLM agents for multi-turn search with tool use, addressing reward sparsity in exploration-based tasks.
MCP Security Bench systematically evaluates attacks against Model Context Protocol in LLM agents, measuring resistance of tool-calling systems to adversarial inputs.
GUIrilla is a scalable framework for automated desktop UI exploration generating large-scale training data for LLM-based GUI understanding and automation.
MeasureBench benchmarks vision-language models on visual measurement reading tasks with real-world and synthesized instrument images.
Xmera framework evaluates adversarial man-in-the-middle attacks on LLM factual recall through prompt injection, measuring vulnerability of question-answering systems.
MOON2.0 addresses multimodal imbalance in MLLMs for e-commerce product understanding through dynamic modality-balanced representation learning.
Multimodal fusion network for pedestrian crossing intention prediction in autonomous vehicle systems.
HumorChain: Theory-guided multi-stage reasoning framework for interpretable multimodal humor generation using LLMs.
Study on spatial reasoning in LLMs for 3D scene understanding, examining attention masking mechanisms for order-agnostic objects.
ThinkDeeper: Framework for autonomous vehicle grounding using world models for 3D spatial reasoning and scene prediction.
ArcGD: Geometrically motivated gradient descent optimizer with phase-aware step dynamics, evaluated on benchmarks.
Research on metaphor-based jailbreak attacks against text-to-image models' safety defense mechanisms.
Zero-shot object navigation for robots using ensemble prediction of future states in unseen, cluttered environments.
Empirical study examining reproducibility gaps in code generated by LLM coding agents and missing dependency specifications.
VLM-CAD: Collaborative agent design workflow for analog circuit sizing using vision-language models with spatial reasoning.
Information-theoretic analysis of trade-offs between fairness, privacy, and accuracy in machine learning using Chernoff Information.
HAVEN: Framework for long-video understanding using agentic search and audiovisual entity cohesion to maintain global coherence.
Analysis of representational homomorphism in transformers to predict and improve compositional generalization in language models.
Vision-DeepResearch: Framework augmenting multimodal LLMs with tool-calling capabilities for visual and textual search.
1S-DAug: One-shot data augmentation method for improved few-shot learning generalization using generative synthesis.
Residual Decoding: Training method to reduce hallucinations in vision-language models using history-aware residual guidance.
FlyPrompt: Brain-inspired routing method for continual learning from non-stationary data streams without task boundaries.
Study evaluating behavioral consistency of LLM agents in stock market simulations against real market participant behavior.
Energy-aware reinforcement learning for robotic manipulation of articulated objects in infrastructure maintenance and smart cities.
KDFlow: Framework for efficient knowledge distillation of large language models into smaller models with heterogeneous training backends.
MA-RAG: Multi-round agentic RAG system for medical reasoning with LLMs, addressing hallucinations and outdated knowledge through iterative refinement.
Augmenting Proximal Policy Optimization with temporal sequence models for robust reinforcement learning under sensor drift and partial observability.
Wi-Fi based human presence detection using monostatic Doppler spectrum on commodity laptops without external sensors.
NCCL EP: unified communication API for mixture-of-experts architectures in large language models, built on NCCL with GPU-initiated RDMA.
Training-free fine-grained visual recognition using large vision-language models with sample-wise adaptive reasoning for subordinate-level category disambiguation.
Video world models for robotics using inverse dynamics rewards to align generated trajectories with executable robot actions.
Systematic analysis of Elastic Weight Consolidation for continual learning showing suboptimal performance and proposing improvements to weight importance estimation.
Benchmark comparing PETNN, KAN, and classical deep learning models on Burmese handwritten digit recognition dataset.
Mi:dm K 2.5 Pro: 32B-parameter enterprise LLM supporting multi-step reasoning, long-context understanding, and agentic workflows in Korean and domain-specific applications.
Formal specification for admission control governance of autonomous agents in institutional B2B environments with cryptographic validation.
Graph-based memory system for LLM reward prediction requiring limited labeled data for reinforcement learning post-training.