Early Discoveries of Algorithmist I: Promise of Provable Algorithm Synthesis at Scale
Study on using LLMs for algorithm synthesis with provable guarantees, combining mathematical reasoning with practical performance.
Research on improving conditional modeling in diffusion models, establishing equivalence between classifier-free guidance and alignment objectives.
Quantum-enhanced graph neural network for network intrusion detection exploiting relational dependencies in traffic flows.
Quantum federated autoencoder framework for anomaly detection in IoT networks using quantum computing and federated learning.
Proposes a Reasoner-Executor-Synthesizer architecture for LLM agents that maintains an O(1) context window while avoiding hallucination and token-cost scaling.
Research evaluating Vision-Language Models' ability to detect misleading data visualizations and deceptive captions in charts.
Multimodal fusion framework for synthetic lethality prediction in cancer drug development, addressing the modality laziness problem.
FAAR quantization method for the NVFP4 ultra-low-bit format that adapts rounding to its non-uniform numerical grid for efficient LLM edge deployment.
Study on multimodal fusion strategies for time series forecasting showing naive fusion fails and proposing constrained fusion approach.
MTEO method for few-step diffusion sampling that distills layer-wise, step-wise time embeddings to accelerate inference.
AI Co-Scientist framework combining LLM agents with cloud computing to automate search ranking research from ideation through GPU training.
PhD thesis on classification and segmentation of gastrointestinal tract images for real-time medical diagnosis applications.
Cross-task evaluation study of LoRA adapters showing nominal instruction-tuning labels don't reliably predict realized instruction-following capabilities.
Symbolic Graph Network framework for discovering partial differential equations from noisy sparse data without numerical differentiation.
Adaptive temporal control system for autonomous agents that learns optimal action intervals using hyperbolic geometry predictive signals.
Quantum Wasserstein GAN with latent style representation for de novo drug design using generative AI.
Open-source framework (CaP-X) for benchmarking and improving code-as-policy agents for robot manipulation tasks.
Token-level analysis of distributional shifts in RLVR fine-tuning of LLMs to understand mechanisms underlying reasoning improvements.
LLM-guided headline rewriting system that enhances reader engagement while maintaining editorial integrity and avoiding clickbait.
Edge AI video sensing paradigm using grayscale capture with selective RGB frames to reduce bandwidth and computational requirements.
Online adaptation method for neural closed-loop control systems that preserves stability while updating controllers during operation.
Ablation study of specialization patterns in sub-1B-parameter hybrid language models that combine attention with state space models.
Framework for building language model general capabilities via automatic curriculum of cross-entropy game tasks for relevant skill discovery.
Inference-time scaling method using small latent verifiers instead of multimodal LLMs to score and select outputs while reducing computational cost.
Empirical study measuring semantic novelty of 13,847 IS papers (2020-2025) to assess whether LLM productivity gains translate to genuine intellectual advancement.
Deep learning framework for flood detection using satellite imagery and random forest-derived labels to map flood extent during disasters.
LLMON proposes a markup language for LLMs that preserves structure and semantics in prompts, distinguishing between instructions and data in input/output.
ChatP&ID is an agentic RAG framework enabling LLM interaction with engineering P&ID diagrams using knowledge graphs for cost-effective grounded reasoning.
Ego2Web benchmark for multimodal web agents grounded in egocentric video, evaluating agents performing real-world workflows with physical context awareness.
STRIATUM-CTF is an agentic framework using search-based reasoning for automated cybersecurity CTF challenge solving with multi-step stateful reasoning.
Study evaluating faithfulness of chain-of-thought reasoning in LLMs, finding models often produce misleading explanations despite correct outputs.
flexvec is a SQL vector retrieval kernel exposing embedding matrices and score arrays for programmatic manipulation by AI agents via Programmatic Embedding Modulation.
Method leveraging vision-language models to explain sparse autoencoder features in vision models through causal interventions instead of correlation-based approaches.
Research examining consumer acceptance of AI in moral compliance roles versus moral decision-making roles across five studies.
Theoretical work on causal discovery in chain-reaction systems using interventional data, proving identifiability under cascade-like structural assumptions.
Evaluation of medical vision-language models revealing a grounding-sycophancy tradeoff, analyzing hallucination and agreement behaviors across six VLMs.
Benchmark for evaluating attribution map faithfulness in semantic segmentation models, testing intervention-based faithfulness and perturbation robustness.
LGSE framework for adapting pretrained language models to low-resource languages using morphologically grounded subword embeddings instead of arbitrary segmentation.
Study examining whether humans can learn to recalibrate AI confidence signals through repeated interaction, testing four calibration conditions with 200 participants.
AwesomeLit proposes an agent-supported literature research system for hypothesis generation, designed for inexperienced researchers to identify gaps and propose feasible research directions.
Approach to neural dynamics modeling from a representation perspective for learning system behavior.
Vision-based deep learning method for unordered biomedical tabular data via optimal spatial cartography.
Wi-Fi CSI-based system for privacy-preserving semantic action captioning using limb-level alignment.
Population-representative resume dataset for causal fairness auditing of LLM/VLM-based screening systems.
Quantitative model predicting when post-hoc fusion of independent LLM specialists outperforms individual models.
Persona-based data augmentation framework using LLMs for legal domain information retrieval in low-resource settings.
Bayesian visualization interface supporting multi-issue human-AI negotiation to manage cognitive load.
Study of neural network resilience to hardware bit-flip errors, comparing logic-based vs arithmetic architectures.
LLM fine-tuning framework addressing the knowledge-action gap in personalized e-commerce search at Taobao.
Hospital study measuring fall rates using continuous AI monitoring systems from August 2024 to December 2025.