Leave My Images Alone: Preventing Multi-Modal Large Language Models from Analyzing Images via Visual Prompt Injection
ImageProtector method prevents multi-modal LLMs from analyzing images via visual prompt injection for privacy protection.
Multi-agent mixture of experts with plasticity enhancement for UAV communication networks under non-stationary conditions using deep RL.
Proposes Advantage-Guided Diffusion for model-based RL using diffusion world models to reduce compounding errors in trajectory generation.
Continual visual place recognition system for aerial autonomy addressing catastrophic forgetting using geometric memory management in dynamic environments.
NyayaMind framework for transparent legal judgment prediction in Indian courts using structured reasoning aligned with legal methodology.
CLIP-Inspector framework for detecting backdoor attacks in prompt-tuned CLIP models via out-of-distribution trigger inversion.
Dynamic Assembly Forest model detecting diffusion-generated images using ensemble methods, as an alternative to deep neural network approaches.
FIRE-CIR framework for composed image retrieval using vision-language models with fine-grained reasoning about what to preserve and modify.
MATCHA: DNN deployment framework generating concurrent schedules for heterogeneous multi-accelerator edge SoCs using constraint programming optimization.
Theoretical framework for identifying causal effects using single proxy variables of unobserved confounders under completeness assumptions.
Energy-Shifting deep learning framework for accelerating Monte Carlo dose calculation in radiotherapy by synthesizing distributions from monoenergetic inputs.
MixFlow method improving diffusion models by using mixed source distributions instead of standard Gaussian to reduce generative path curvature.
Symbolic-Neural Consistency Audit (SNCA) framework that extracts LLM self-stated safety policies via prompts and verifies model adherence to them.
YOLOv8-based facade parsing system augmented with alignment loss to enforce structural coherence in architectural element detection.
Riemannian gradient descent approach for optimizing low-rank functional tensor networks on arbitrary loss functions beyond least-squares regression.
Online intention prediction framework for autonomous systems using inverse reinforcement learning with time-varying objectives and unknown parameters.
Iterative Identification Closure framework for determining causal identifiability in linear structural equation models with latent confounders.
Fragment-based graph neural network integrated with many-body expansion theory for predicting potential energy surfaces in chemical systems.
CrossAbSense framework using protein language model encoders and attention decoders to predict antibody properties for therapeutic design validation.
Hybrid quantum-classical physics-informed neural networks for hydrological modeling with uncertainty quantification using variational quantum circuits.
Theoretical analysis of loss landscape in two-layer ReLU neural networks, characterizing local minima and their connection to stochastic gradient descent dynamics.
Learning-to-Defer framework that routes inputs to experts while selecting additional information (retrieved documents, tool outputs) to provide each expert, extending traditional routing systems.
Large-scale synthetic dataset with 2M videos covering physical phenomena for training physics-aware AI systems.
Systematic comparison of LLM task adaptation strategies including instruction revision, prompt optimization, and retrieval methods.
Video diffusion model learning joint distribution of video frames and camera trajectories for novel view synthesis.
Neural network architecture for haptic signal prediction in tactile internet using mode decomposition.
Open-source dataset and code for classifying human activity from accelerometer sensor data.
Model poisoning attack on federated learning without client collusion using independent adversarial updates.
Post-training method enabling LLMs to retrieve and reason over long-context information effectively.
Video prediction model representing scene dynamics as sparse point trajectories for efficient future frame synthesis.
Framework for training LLMs to make evidence-dependent predictions by grounding supervision in case-specific evidence.
Mechanistic study using weight pruning to identify a unified internal mechanism that LLMs use for generating harmful content.
Data-free meta-learning robustness analysis examining failure modes when learning from pre-trained models without training data.
MARL method using temporal sparse coordination graphs to improve agent cooperation from historical experiences.
Multi-agent reinforcement learning coordination via graph structures capturing higher-order group relationships.
Self-supervised learning for ECG signal representation using masked modeling in medical domain.
GNN scalability method using graph coarsening to reduce inference-time computational costs.
Diffusion model approach for defending graph neural networks against adversarial attacks.
Graph neural network architecture using Mamba state space models to address over-smoothing in deep GNNs.
Method using low-rank techniques for Bayesian uncertainty quantification in neural networks via Laplace approximation.
Research on polysemanticity in LLMs showing neurons encode multiple concepts, challenging discrete attribution methods for model interpretability.
Research on reducing class bias in balanced datasets using hardness-based resampling instead of frequency-based methods.
Federated continual fine-tuning with low-rank residual adaptation, enabling parameter-efficient learning of new classes in federated settings.
Proxy model framework for efficient post-hoc interpretability of LLMs, reducing computational costs of model-agnostic explanations.
Theoretical analysis of OPTQ/GPTQ post-training quantization for LLMs, providing rigorous quantitative guarantees for PTQ algorithms.
Kolmogorov-Arnold networks with autoregressive weights for time series forecasting, extending comparisons beyond LLMs and FNNs.
Spatial-temporal weather forecasting with adaptive boundary alignment for regional integration from global atmosphere predictions.
Configuration-aware LoRA adaptation for quantized LLMs enabling efficient edge device deployment with heterogeneous capabilities.
RECAP: RL method for safety alignment in large reasoning models, teaching critical evaluation of flawed premises via counter-aligned prefilling.
Open dataset of batch distillation experiments for developing ML anomaly detection methods in chemical processes.