OAT: Ordered Action Tokenization
Evaluating Kubernetes Performance for GenAI Inference: From Automatic Speech Recognition to LLM Summarization
From Fragmentation to Integration: Exploring the Design Space of AI Agents for Human-as-the-Unit Privacy Management
Variational Speculative Decoding: Rethinking Draft Training from Token Likelihood to Sequence Acceptance
Implementing Grassroots Logic Programs with Multiagent Transition Systems and AI
Bielik Guard: Efficient Polish Language Safety Classifiers for LLM Content Moderation
MIND: Benchmarking Memory Consistency and Action Control in World Models
Breaking the Simplification Bottleneck in Amortized Neural Symbolic Regression
CIC-Trap4Phish: A Unified Multi-Format Dataset for Phishing and Quishing Attachment Detection
Learning to Remember, Learn, and Forget in Attention-Based Models
SWE-AGI: Benchmarking Specification-Driven Software Construction with MoonBit in the Era of Autonomous Agents
EcoGym: Evaluating LLMs for Long-Horizon Plan-and-Execute in Interactive Economies
On the Optimal Reasoning Length for RL-Trained Language Models
A Controlled Study of Double DQN and Dueling DQN Under Cross-Environment Transfer
Text summarization via global structure awareness
Monocular Normal Estimation via Shading Sequence Estimation
Infusion: Shaping Model Behavior by Editing Training Data via Influence Functions
RoboSubtaskNet: Temporal Sub-task Segmentation for Human-to-Robot Skill Transfer in Real-World Environments
Fake-HR1: Rethinking Reasoning of Vision Language Model for Synthetic Image Detection
Large Language Models Predict Functional Outcomes after Acute Ischemic Stroke
Towards Autonomous Mathematics Research
Signature-Kernel Based Evaluation Metrics for Robust Probabilistic and Tail-Event Forecasting
Versor: A Geometric Sequence Architecture
Adaptive Optimization via Momentum on Variance-Normalized Gradients
Neural Network Quantum Field Theory from Transformer Architectures
How Much Reasoning Do Retrieval-Augmented Models Add beyond LLMs? A Benchmarking Framework for Multi-Hop Inference over Hybrid Knowledge
Rank-Accuracy Trade-off for LoRA: A Gradient-Flow Analysis
ELROND: Exploring and decomposing intrinsic capabilities of diffusion models
Temper-Then-Tilt: Principled Unlearning for Generative Models through Tempering and Classifier Guidance
Internalizing Meta-Experience into Memory for Guided Reinforcement Learning in Large Language Models
Self-Evolving Recommendation System: End-To-End Autonomous Model Optimization With LLM Agents
PRISM: Differentially Private Synthetic Data with Structure-Aware Budget Allocation for Prediction
Frame-Level Internal Tool Use for Temporal Grounding in Audio LMs
Blockwise Advantage Estimation for Multi-Objective RL with Verifiable Rewards
Risk-Equalized Differentially Private Synthetic Data: Protecting Outliers by Controlling Record-Level Influence
Modeling Programming Skills with Source Code Embeddings for Context-aware Exercise Recommendation
Kernel-Based Learning of Chest X-ray Images for Predicting ICU Escalation among COVID-19 Patients
From Classical to Topological Neural Networks Under Uncertainty
Linear-LLM-SCM: Benchmarking LLMs for Coefficient Elicitation in Linear-Gaussian Causal Models
What Does Preference Learning Recover from Pairwise Comparison Data?
Configuration-to-Performance Scaling Law with Neural Ansatz
ICODEN: Ordinary Differential Equation Neural Networks for Interval-Censored Data
Confounding Robust Continuous Control via Automatic Reward Shaping
R2RAG-Flood: A reasoning-reinforced training-free retrieval-augmented generation framework for flood damage nowcasting
Stop Training for the Worst: Progressive Unmasking Accelerates Masked Diffusion Training
Identifying Evidence-Based Nudges in Biomedical Literature with Large Language Models
Theoretical Analysis of Contrastive Learning under Imbalanced Data: From Training Dynamics to a Pruning Solution
Simple LLM Baselines are Competitive for Model Diffing
Hardware Co-Design Scaling Laws via Roofline Modelling for On-Device LLMs
Deep learning outperforms traditional machine learning methods in predicting childhood malnutrition: evidence from survey data