Interpretability study probing internal representations of world models (IRIS and DIAMOND) in RL using linear/nonlinear probing and causal interventions.
Information-theoretic analysis of LLM steganography showing Kolmogorov complexity bounds on hidden payload embedding in text while preserving semantic meaning.
SSAM method merges multiple pre-trained multimodal LLMs without additional training by aligning singular subspaces, enabling efficient multi-modality integration.
Lightweight autoencoder-based anomaly detection using federated learning for IoT networks, enabling privacy-preserving security monitoring on resource-constrained devices.
Framework for building general-purpose Graph Foundation Models using Riemannian geometry principles, analogous to large language models for graph-structured data.
mSFT algorithm for multi-task supervised fine-tuning that addresses heterogeneous overfitting by dynamically adjusting compute budget per dataset to balance learning rates.
Bayesian framework for compliance monitoring in rule-governed domains, inferring latent states given known rules rather than learning rules from data.
Multimodal time series anomaly detection model combining numerical and semantic data with alignment and interaction mechanisms for dynamic system monitoring.
GSB-PPO extends proximal policy optimization to trajectory-level generative policies using Schrödinger Bridge perspective, enabling diffusion and flow-based policy optimization.
Session-based graph learning model for predicting next mobile app launches by modeling multi-hop intent patterns and handling sparse/cold-start user profiles.
Federated learning framework for privacy-preserving medical AI training across healthcare institutions while addressing data heterogeneity and deployment challenges.
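The federated setups above typically build on federated averaging (FedAvg). A minimal sketch of the server-side aggregation step, assuming clients return plain parameter vectors (this is the generic algorithm, not the specific framework summarized above):

```python
# Sketch of FedAvg aggregation: the server averages client parameter
# vectors, weighted by each client's local dataset size.
# Illustrative only; function and variable names are hypothetical.

def fed_avg(client_params, client_sizes):
    """Weighted average of client parameter vectors (lists of floats)."""
    total = sum(client_sizes)
    merged = [0.0] * len(client_params[0])
    for params, size in zip(client_params, client_sizes):
        weight = size / total
        for i, p in enumerate(params):
            merged[i] += weight * p
    return merged

# Two clients, the first holding 3x as much data as the second.
merged = fed_avg([[1.0, 2.0], [5.0, 6.0]], [3, 1])
print(merged)  # [2.0, 3.0]
```

In a real deployment only model updates leave each institution, which is what makes the scheme privacy-preserving relative to pooling raw patient data.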
Model merging technique using Fisher Information to combine long-chain-of-thought and base LLMs, preserving reasoning accuracy while reducing output length without additional training.
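Fisher-weighted merging of the kind referenced above averages parameters while weighting each coordinate by an estimate of its (diagonal) Fisher information, so parameters a model is "confident" about dominate the merge. A hedged sketch, not the paper's exact scheme:

```python
# Sketch of diagonal-Fisher-weighted model merging for two models.
# theta_* are parameter vectors, fisher_* are per-parameter Fisher
# information estimates. Names and the eps smoothing are illustrative.

def fisher_merge(theta_a, theta_b, fisher_a, fisher_b, eps=1e-8):
    """Per-parameter weighted average using Fisher information as weights."""
    merged = []
    for ta, tb, fa, fb in zip(theta_a, theta_b, fisher_a, fisher_b):
        merged.append((fa * ta + fb * tb) / (fa + fb + eps))
    return merged

# A parameter with high Fisher weight in model A stays close to A's value.
print(fisher_merge([1.0], [3.0], [9.0], [1.0]))  # ~[1.2]
```

Because the merge is a closed-form weighted average, it requires no gradient updates, which is what "without additional training" refers to in such summaries.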
Multi-armed bandit approach for selecting among generative models under diversity-aware metrics, addressing efficient model selection in generative AI without relying on classical UCB algorithms.
Uncertainty quantification for distribution-to-distribution flow matching in scientific imaging applications.
FISformer replaces self-attention with fuzzy inference systems in transformers for time series forecasting, addressing uncertainty modeling limitations of dot-product attention.
Post-training virtual cell models with RL using biologically-constrained reward functions for drug discovery simulation.
Precipitation nowcasting approach combining radar imagery with weather foundation model predictions via spectral fusion.
Method for analyzing feature invariances in ML models by sampling from learned equivalence classes without dedicated generators.
Lightweight adapter module enhancing time series foundation models by incorporating correlation information across channels.
Benchmark dataset and baselines for PPG-based clinical prediction tasks from MIMIC-III data.
Analysis of computational complexity in constraint-based causal discovery algorithms using conditional independence tests.
Scaling laws for Mixture-of-Experts architecture design balancing global interactions and MoE-specific variables in LLMs.
Joint optimization of RL policies and LLM prompts for improving reasoning with verifiable rewards on hard samples.
Parameter-efficient vector-quantized UNet variant for weather precipitation nowcasting with reduced computational requirements.
Energy optimization technique for edge device inference using fine-grained, network-sparsity-aware dynamic voltage and frequency scaling (DVFS).
Analysis of temporal difference error interpretations in deep reinforcement learning and impact on critic loss formulation.
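The quantity under analysis above is the standard one-step TD error, delta = r + gamma * V(s') - V(s); the computation itself is textbook and shown here for reference (the paper's particular interpretations are not reproduced):

```python
# One-step temporal-difference (TD(0)) error:
#   delta = reward + gamma * V(next_state) - V(state)
# The bootstrap term is dropped at terminal states.

def td_error(reward, gamma, v_s, v_next, terminal=False):
    """Return the TD(0) error for a single transition."""
    bootstrap = 0.0 if terminal else gamma * v_next
    return reward + bootstrap - v_s

print(td_error(1.0, 0.99, 0.5, 1.0))            # 1.49
print(td_error(1.0, 0.99, 0.5, 1.0, True))      # 0.5 (no bootstrap)
```

In actor-critic methods this error (or a squared version of it) typically forms the critic loss, which is why its interpretation affects the loss formulation.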
Systematic empirical study on scaling RL for autonomous LLM agents with long-horizon tool orchestration using TravelPlanner benchmark.
Gradient-boosted decision trees method for power flow analysis in distribution systems using sequential path-based learning.
Framework for explaining trajectories in multi-objective reinforcement learning agents handling conflicting objectives.
Learning-based approach to parameterize GELU activation functions for converting smooth networks to piecewise-linear ReLU equivalents.
Non-parametric conformal regression method using binning optimization with CRPS metric for conditional distribution estimation.
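The conformal machinery underlying methods like the one above starts from split conformal prediction: calibrate absolute residuals on held-out data, then use their conformal quantile as an interval half-width. A minimal sketch of that baseline (the paper's binning and CRPS refinements are not reproduced):

```python
# Split conformal prediction interval sketch. Calibration residuals give
# a quantile q; the interval for a new prediction is [pred - q, pred + q].
import math

def conformal_interval(y_cal, y_pred_cal, y_pred_new, alpha=0.1):
    """Return a (lo, hi) interval with ~(1 - alpha) marginal coverage."""
    residuals = sorted(abs(y - p) for y, p in zip(y_cal, y_pred_cal))
    n = len(residuals)
    # Conformal quantile rank: ceil((n + 1) * (1 - alpha)), clipped to n.
    k = min(math.ceil((n + 1) * (1 - alpha)) - 1, n - 1)
    q = residuals[k]
    return y_pred_new - q, y_pred_new + q

y_cal = [1.0, 2.0, 3.0, 4.0, 5.0]
preds = [1.1, 1.8, 3.3, 3.9, 5.2]
lo, hi = conformal_interval(y_cal, preds, 10.0, alpha=0.2)
print(lo, hi)  # roughly 9.7 10.3
```

Binning the calibration set by predicted value (as the summarized method does) lets the interval width adapt to the conditional distribution rather than being constant.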
Method to reduce overthinking in Large Reasoning Models by detecting and stopping redundant reasoning steps, lowering latency and compute costs.
AdditiveLLM2, a multimodal LLM domain-adapted from Gemma 3 for additive manufacturing via instruction tuning on a domain corpus.
Framework and benchmark for detecting inconsistencies between research papers and their implementations in bioinformatics software.
Analysis of how overparametrization and priors interact in Bayesian neural network posteriors and their effects on inference.
Study on why topic-matched contrast baselines fail in directional refusal abliteration for removing safety behaviors from LLMs.
MIHT algorithm for time series classification using multi-instance learning on variable-length and high-dimensional temporal data.
Analysis of reinforcement learning with verifiable rewards for LLM reasoning, focusing on direction rather than magnitude of weight updates.
Computationally efficient classifier with frequentist uncertainty bounds suitable for safety-critical applications.
Trainable activation function family (dynActivation) providing adaptive nonlinearity for vision and language modeling tasks.
RAMPAGE algorithm addressing discretization bias in extragradient methods for variational inequalities with variance reduction.
Multimodal survival analysis combining clinical text, tabular data, and genomics using locally deployable lightweight LLMs for privacy-constrained settings.
Causal investigation of whether LLMs use internal confidence estimates to regulate behavior through abstention paradigm experiments.
Theoretical framework reducing calibration of forecasts to online learning techniques with results for general proper losses.
Study on incorporating domain knowledge into LLM-based code generation for quantum software development while maintaining maintainability.
Chimera serving system for multi-agent LLM workflows optimizing latency and performance on heterogeneous model deployments.
SPA baseline method using prompt engineering to generate synthetic data for knowledge injection into LLMs in specialized domains.
Benchmarking methodology for probabilistic time series forecasting using noise titration to test model robustness to non-stationarity.
Decoding strategy analysis for diffusion language models showing confidence-based decoding is provably efficient for parallel token generation.
Reinforcement learning approach decoupling exploration and policy optimization using uncertainty-guided tree search for autonomous agent exploration.