How Similar Are Grokipedia and Wikipedia? A Multi-Dimensional Textual and Structural Comparison
Computational analysis of 17,790 article pairs from Grokipedia (AI-generated) and Wikipedia, examining textual and structural biases.
EGMOF: hybrid diffusion-transformer for metal-organic framework generation with inverse design capabilities for materials discovery.
Inference-time optimization using evolutionary algorithms on prompt embeddings for diffusion model control without fine-tuning.
Structured uncertainty framework for LLM agents with tool-calling to generate principled clarifying questions for ambiguous user instructions.
Language-conditioned humanoid robot control using LLM with unified motion vocabulary for free-form command execution and embodied AI.
Bharat Scene Text dataset and benchmark for Indian language scene text recognition addressing script diversity and font variations.
AV-SpeakerBench: multimodal LLM benchmark with 3,212 questions evaluating audiovisual speech understanding and speaker-speech alignment in video.
Analysis of flow-based diffusion models revealing two-stage behavior through oracle velocity field computation and memorization-generalization tradeoffs.
Research on adversarial perturbations for object detectors using black-box attacks to expose vulnerabilities and understand attack mechanisms.
Research on self-distillation methods for teaching language models to leverage cognitive skills like verification and backtracking without base model exposure.
Research on relational visual similarity in computer vision showing how humans perceive analogical relationships beyond attribute similarity.
Framework combining mechanism design and online learning for sequential mechanism design where principal learns agent beliefs while ensuring truthfulness.
Mechanistic study of self-reflection emergence in RL-trained LLMs, proposing two-stage decision-sampling hypothesis to explain unified optimization producing distinct capabilities.
White-box adversarial attack method on computer vision models using SHAP values to generate imperceptible evasion attacks.
Training-free framework for human video animation using cached reference frames to model long-range dependencies while preserving temporal coherence.
Analysis showing layer pruning of LLMs degrades generative reasoning tasks beyond surface degradation, causing loss of algorithmic capabilities.
Method addressing prompt misguidance in diffusion-based super-resolution by using tiled prompts for localized semantic guidance.
Multi-agent framework for smart contract auditing using specialized agents for planning, execution, and recovery with coordination protocols.
Study demonstrating LLM biases when simulating misinformation susceptibility: models overstate attitude shifts and ignore the network effects observed in humans.
Qualitative study of 33 K-12 teachers' perspectives on using conversational AI agents to scaffold group collaboration in classrooms.
Adaptive framework for demand forecasting model selection addressing horizon-induced performance degradation in inventory planning.
Pipeline combining subquadratic retrieval and GPU-accelerated kernels for analyzing immune repertoires at population scale.
Dataset of parasitoid wasps and other Hymenoptera for taxonomic identification and biodiversity monitoring.
Knowledge distillation method for distilling RL-trained LLMs with chain-of-thought reasoning into smaller student models while preserving reasoning capabilities.
Theoretical analysis explaining why Adam optimizer outperforms SGD through second-moment normalization using stopping-time and martingale analysis.
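For context on the second-moment normalization that analysis studies, here is a minimal sketch of the standard Adam update (Kingma & Ba); it illustrates the per-coordinate step scaling by the square root of the second-moment estimate, not the paper's stopping-time or martingale arguments, and the function and parameter names are illustrative:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update step (illustrative sketch).

    The second-moment estimate v normalizes the step size per coordinate,
    which is the mechanism the analysis credits for Adam's edge over SGD.
    """
    m = beta1 * m + (1 - beta1) * grad        # first moment: running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # second moment: running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction (t is 1-indexed)
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Usage sketch: minimize f(x) = x^2 from x = 1
theta, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 101):
    grad = 2.0 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
```

Note how the effective step size is roughly `lr * m_hat / sqrt(v_hat)`, i.e. approximately sign-like in early iterations regardless of gradient magnitude, unlike SGD's magnitude-proportional step.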
Analysis showing chain-of-thought prompting underperforms direct answering in medical vision-language models due to perception bottlenecks in domain-specific tasks.
Memory-efficient continual learning method using prototypical exemplar condensation to reduce storage requirements while maintaining performance.
Parallel framework combining imitation and reinforcement learning for autonomous driving, addressing limitations of sequential fine-tuning approaches.
Method to improve pretrained generative robot policies by replacing sampled noise with optimized constant noise vectors for downstream reward optimization.
Mid-training adaptation strategy for LLMs to improve automatic summarization of radiology reports, exploring domain-specific pre-training approaches.
RAM: motion capture system for 3D human pose reconstruction in unconstrained video with occlusion handling and temporal smoothing.
ChronoCon: contrastive learning approach for disease progression assessment from longitudinal medical imaging without explicit severity annotations.
CAIAMAR: multi-agent framework for context-aware image anonymization in street-level imagery using agentic reasoning.
Kill-chain canary methodology for tracking prompt injection attacks across multi-agent LLM systems with stage-level diagnostics.
System for making mathematical theorems interactive by grounding LLM-generated explanations in formal representations enabling execution and stepping.
Framework for eliciting and verbalizing LLM assumptions to explain and mitigate sycophantic behavior in model outputs.
Multi-stage LLM-assisted workflow for scientific algorithm development separating theory extraction, formal specification, and code generation.
Method for LLM personalization using a small portfolio of models capturing diverse user preferences without per-user models.
Distributional reinforcement learning approach for decision-making in healthcare, accounting for uncertainty across heterogeneous populations.
ALTO: system for adaptive LoRA hyperparameter tuning and orchestration across heterogeneous LLM fine-tuning workloads in multi-tenant environments.
DiffHDR: video diffusion model approach for converting low-dynamic-range videos to high-dynamic-range format.
WisdomInterrogatory (LuWen): open-source Chinese legal language model built on Baichuan foundation model for legal domain applications.
System for safe capability evolution in embodied agents with compatibility checking and runtime rollback mechanisms.
Training-free open-vocabulary semantic segmentation framework (OV-Stitcher) leveraging pretrained vision-language models without additional training.
HyperMem: hypergraph-based memory architecture for conversational agents enabling long-term context tracking and high-order associations.
Quantum-inspired ARIMA methodology combining quantum autocorrelation with variational quantum circuits for time series analysis.
Vision-language benchmark (CrashSight) for evaluating traffic crash scene understanding from infrastructure perspective.
Physics-aligned simulator (SIM1) for generating synthetic data in deformable object robotic manipulation tasks.
Framework combining LLMs with graph neural networks for text-attributed graph learning in low-resource settings using GNN feedback.
Bayesian optimization method (MG-TuRBO) for high-dimensional traffic simulation calibration, comparing genetic algorithms with Bayesian approaches.