Bid2X: Revealing Dynamics of Bidding Environment in Online Advertising from A Foundation Model Lens
Foundation model approach to auto-bidding in online advertising that generalizes across different bidding scenarios.
EcoAlign framework balances safety, utility, and computational cost in aligning Large Vision-Language Models against jailbreak attacks.
Echo-CoPilot: agentic framework combining multi-perspective workflow with knowledge-graph guidance for reliable echocardiography interpretation.
EventGPT applies GPT-based framework to forecast football player transfer success by analyzing team action sequences and tactical context.
Reason2Decide: two-stage training framework for clinical decision support LLMs to generate predictions with self-aligned explanations.
MultiSessionCollab benchmark and method for long-term conversational agents to learn and leverage user preferences across multiple sessions.
Extends Dung's abstract argumentation framework with subargument relations and structural dependencies for formal argumentation systems.
Position paper arguing agentic evolution via deployment-time adaptation is needed to close the train-deploy gap in LLM systems.
Method for compiling random forest classifiers into circuits for explainability and tractable computation of complete generalizations.
Formalizes causal Rung Collapse, where LLMs learn spurious associations instead of causal relationships, and proposes epistemic regret minimization as a solution.
Study showing fine-tuning vision-language agents on narrow harmful datasets causes emergent misalignment generalizing across unrelated tasks and modalities.
BAPO: off-policy reinforcement learning framework improving data efficiency in LLM post-training by selecting diverse training experiences.
Aletheia mathematics research agent solved 6 of 10 FirstProof challenge problems autonomously using Gemini 3 Deep Think reasoning.
Framework for decision-level evaluation of AI agents in AutoML pipelines beyond outcome metrics, assessing intermediate reasoning steps.
Survey and framework for personalized LLM-powered agents that adapt to individual users over extended interactions with evaluation methods and research directions.
Human study measuring whether LLM access improves novice performance on biology tasks versus internet-only baselines, with dual-use risk implications.
EMPA framework evaluates how well LLM dialogue agents maintain persona-aligned empathy across multi-turn conversations using process-oriented metrics.
WebChain: 31,725 human-annotated web interaction trajectories with 318k steps in multi-modal format for training and evaluating web agents.
LLMs used to synthesize executable game design patterns from high-level gameplay ideas, focusing on goal patterns and structural constraints in game creation.
Framework for automated frontier AI risk evaluation using executable code environments and LLM-based simulators.
Offline reinforcement learning method using robust policy optimization under distribution shift and transition uncertainty.
Benchmark evaluating LLM ability to generate interactive HTML-based MiniApps with dynamic interfaces and logic.
Study showing reasoning and deliberation increase honesty in LLM responses on moral trade-off scenarios.
Framework for efficient LLM distillation that focuses training on problems at frontier of student capability.
Protocol for detecting self-preservation behaviors in autonomous agents to distinguish intrinsic from instrumental objectives.
Framework coordinating multiple LLM-based agents through a verification loop, using DAG decomposition to resolve complex queries.
Benchmark with 6,372 multimodal reasoning instances that evaluates LLM reasoning transparency through verifiable intermediate steps.
Research on semantic invariance in LLM-based autonomous agents, a property intended to keep reasoning stable under input variations.
Unsupervised pre-training framework for point cloud representations using contrastive learning and clustering.
Method using counterfactual thinking to identify and address bias and fairness issues in machine learning models.
Neural network framework for handling geometrically distorted images in computer vision tasks.
Research on automated prompt generation and optimization techniques for improving LLM performance through meta-prompting approaches.
Deep learning approach for 3D structure and camera reconstruction from 2D landmarks using foundation models.
Method for detecting out-of-distribution samples in machine learning using density estimation with conjugate normalization.
Ayn: domain-specific tiny language model pretrained from scratch for Indian legal NLP tasks as alternative to large LLMs.
Survey examining computerized adaptive testing through machine learning lens, covering personalized assessment methods across domains.
Universal approximation theorem and operator learning methods for continuous nonlinear operators in Banach spaces using orthogonal projections.
Multi-task learning AI system for basal cell carcinoma detection with dual explanation mechanisms for clinical transparency.
TraffiDent dataset aligning traffic dynamics and incident data across 16,972 nodes for understanding their interplay.
Analysis of how skip connections in deep networks enhance adversarial example transferability across models.
Time series forecasting approach accounting for latent confounders using causal inference to improve prediction accuracy.
Causal inference method using LLMs to quantify effects of textual interventions on social systems from observational data.
VisionZip reduces computational costs in vision-language models by compressing redundant visual tokens while maintaining performance.
Multi-modal learning approach addressing modality imbalance in 3D human pose estimation using RGB and non-intrusive sensors.
Virtual full-stack brain MRI scanning method that imputes missing acquisition modalities from incomplete MRI data using learned representations.
Novel loss function (L1DFL) for detecting prostate cancer in PET/CT images using deep neural networks with voxel-weighted optimization.
SyncSpeech proposes Temporal Masked Transformer for efficient, low-latency text-to-speech combining benefits of autoregressive and non-autoregressive models.
Deep learning approach for breast cancer subtype prediction using adaptive methods to address class imbalance and domain shift challenges.
RRNCO addresses real-world deployment of neural combinatorial optimization for vehicle routing by handling asymmetric costs and edge-based features.
Review of LLM-driven approaches for creating virtual agents with personality in VR environments using multimodal outputs.