How Do Latent Reasoning Methods Perform Under Weak and Strong Supervision?
Empirical study of latent reasoning methods under weak and strong supervision for multi-step LLM reasoning.
BanglaBERT with stacked LSTM for multi-label cyberbullying detection in Bangla text.
Uncertainty-aware policy steering using VLM verifiers to adapt robot behaviors by selecting aligned action samples.
VeRO: Evaluation framework for assessing coding agents that optimize other agents through iterative edit-execute-evaluate cycles.
Flow matching generative model that adapts to manifold structures, offering simulation-free alternative to diffusion models.
Theoretical analysis of scaling limits from shallow Bayesian neural networks to Gaussian processes with scalable inference methods.
TFPS: Temporal filtration method for constructing positive sample sets in implicit collaborative filtering recommendation systems.
AI-driven ensemble forecast system for tropical cyclone prediction combining dynamics models with AI optimization.
HARU-Net: Hybrid attention residual U-Net for denoising cone-beam CT medical images.
Dynamic dense retrieval with routing strategy for adapting information retrieval models across domains without full retraining.
Column generation algorithm for identifying carcinogenic multi-hit gene combinations in cancer research.
CourtGuard: Model-agnostic multi-agent framework for zero-shot LLM safety policy adaptation using retrieval-augmented debate.
Search-P1: Path-centric reward shaping for training agentic RAG systems with improved sample efficiency via RL.
Item Response Theory approach to correct systematic rater biases in human evaluations for AI model assessment.
SideQuest: Model-driven KV cache management technique for long-context agentic reasoning tasks with multi-hop retrieval.
RL-based approach for generating hardware model checking benchmarks at algorithmic level.
Hybrid ML framework combining autoencoders and transformers for accelerator beam diagnostics simulations.
Novel learning approach for designing KKL observers in non-autonomous nonlinear systems using neural networks to approximate PDEs.
Trie-based constrained decoding optimization for LLM generative retrieval on accelerators. Improves handling of business logic constraints in recommendations.
dLLM: Unified framework for diffusion language models. Standardizes components across research implementations for reproducibility.
DPSQL+: Differentially private SQL library with minimum frequency rule. Privacy-preserving data analysis.
GR4AD: Production generative recommendation system for large-scale advertising using LLMs. Architecture, learning, and serving optimization.
AMA-Bench: Benchmark evaluating long-horizon memory for LLM-based agentic applications. Addresses gap between dialogue benchmarks and real agent scenarios.
QSIM: Multi-agent reinforcement learning method addressing Q-value overestimation via action similarity weighting. Value decomposition improvement.
arXiv paper on diffusion models for end-to-end autonomous driving in real-world settings, exploring their use for decision-making.
FlexMS framework for benchmarking deep learning mass spectrum prediction tools in metabolomics and drug discovery.
TARAZ benchmark for evaluating cultural competence of LLMs in Persian with short-answer format and morphological analysis.
Unsupervised continual learning framework for amortized Bayesian inference handling sequential data and distribution shifts.
SPD Learn: Python library for symmetric positive definite (SPD) matrix neural networks, applying geometric deep learning to neural decoding.
OmniGAIA benchmark for evaluating omni-modal AI agents with unified vision, audio, and language perception capabilities.
SIGMA system applying LLMs to multi-task recommendation at scale, handling diverse business requirements beyond traditional next-item prediction.
Large deviations theory analysis of wide Bayesian neural networks studying posterior concentration and feature learning.
Parameter-efficient fine-tuning combining multiple domain expert models for visual adaptation tasks via prompt tuning.
Fourier feature method for efficient nonstationary Gaussian process simulation with spectral techniques.
Sequential regression approach using residual quantization for continuous value prediction in recommendation systems.
Training-free few-shot anomaly detection using foundation model features and subspace modeling for industrial inspection.
Theoretical lower bounds for clustering in moderate dimensions using low-degree polynomial analysis.
Tree-based system for analyzing semi-structured documents with mixed content types enabling question-answering over complex layouts.
LLM agent framework (SALA) for evaluating deanonymization risks through stylometry-assisted analysis of textual data with interpretable pipeline.
Watermarking technique for protecting quantum circuits as intellectual property in quantum cloud computing platforms.
Diffusion-based approach for multi-behavior sequential recommendation systems capturing dynamic user preferences across heterogeneous interaction types.
Study of information-theoretic limits in multimodal LLMs showing modality-specific information is discarded during decoding despite surviving encoding layers.
SettleFL protocol for decentralized reward settlement in federated learning using blockchain, addressing scalability via off-chain batching.
arXiv paper: Fairness-aware mixed-precision quantization for neural network compression in medical imaging, explicitly addressing algorithmic fairness during model compression.
arXiv paper: Theoretical analysis of fine-tuning effects on in-context learning in linear attention transformers, balancing downstream and few-shot task performance.
arXiv paper: Plug-and-play diffusion framework with ADMM for medical image reconstruction addressing memory issues in PnP solvers.
arXiv paper: Large-scale app store search ranking system augmented with LLM-generated textual relevance judgments to address scarcity of expert labels.
arXiv paper: Zeroth-order optimization for leader-follower Stackelberg control in combinatorial congestion games with discrete strategy selection.
arXiv paper: Training-free hierarchical manifold guidance for dataset distillation using diffusion models without requiring gradient computation.
arXiv paper: Zero-shot and one-shot adaptation of small language models for leader-follower role assignment in human-robot interaction with resource constraints.