Integrating Arithmetic Learning Improves Mathematical Reasoning in Smaller Models
Method for improving mathematical reasoning in smaller LLMs by integrating arithmetic learning with knowledge distillation and data augmentation.
Survey of edge-cloud collaborative computing paradigms for distributed AI deployment, covering model optimization and LLM inference strategies.
Framework for constraint learning using pruned neural networks as tractable surrogates in optimization problems.
CIM-Explorer tool for optimizing binary and ternary neural networks on RRAM crossbar hardware architectures.
Information Imbalance metric for analyzing semantic information alignment in deep representations across text and image models.
Methods for constructing confidence intervals and hypothesis tests for functionals derived from online/sequential algorithms with computational constraints.
BiomedSQL benchmark for evaluating text-to-SQL systems on biomedical knowledge bases requiring implicit domain reasoning and scientific understanding.
Study on how prompt variability affects LLM code generation quality and functionality across different user backgrounds and expertise levels.
Deep learning models integrated with satellite data to reconstruct global forest carbon dynamics from 1988-2021 with uncertainty quantification.
Hebbian Physics Networks: self-organizing computational architecture using plastic transport geometry for solving physical dynamics problems.
SHAP-based framework for analyzing urban exercise inequality using spatial theory and machine learning on Shenzhen street data.
QR-learner model for estimating conditional treatment effects in trials using external data.
ToolRegistry: protocol-agnostic tool management library for function-calling LLMs, addressing fragmentation in tool integration.
Analysis of GNN generalization error to explain performance variance and benchmark skew in graph neural networks.
Benchmark study showing large multimodal models fail at inductive physical reasoning beyond training distribution.
EdiVal-Agent framework for automated, fine-grained evaluation of multi-turn image editing using object-centric assessment.
Detection methods for data contamination in the RL post-training phase of LLMs, addressing a gap in evaluation validity.
Causal discovery method using multi-environment data to achieve full causal graph identifiability with minimal environments.
CBF-RL integrates Control Barrier Functions into RL training to enforce safety constraints during policy learning.
Unified optimization framework for jointly inferring time-varying network topologies and imputing missing graph signal data.
Neighbor GRPO extends Group Relative Policy Optimization to flow matching models with contrastive ODE-based approach for generative model alignment.
Knowledge Immunization Framework for selective knowledge erasure from LLMs via representation-aware activation signatures, addressing GDPR compliance and safety concerns.
Deep autoencoder applied to FPUT trajectories to infer intrinsic dimensionality in weakly nonlinear regime.
Reinforcement learning approach combining Lyapunov stability constraints with soft actor-critic for safe quadrotor control.
Study demonstrating reward-free backdoor attacks on RL agents through compromised simulators.
Knowledge distillation framework for fine-grained visual classification using vision-language models with prompt-aware calibration.
Theoretical study identifying phases of matter computationally hard to learn with autoregressive neural networks.
Learnable Gaussian sampling method for inference-time scaling in latent reasoning models to improve reasoning path generation.
Analysis of gradient descent convergence with dual space preconditioners in overparameterized linear models.
Method for achieving fairness in AI systems without demographic attributes for human-centered applications.
Event-driven approach to text-to-video generation that models discrete interactions and causality rather than frame-by-frame updates.
Method for fine-tuning diffusion policies with reinforcement learning for humanoid robot loco-manipulation tasks.
Conditional flow matching framework for solving physics-constrained Bayesian inverse problems without explicit likelihood evaluation.
Pruning method for efficient large vision-language model inference by exploiting attention patterns and addressing token redundancy.
Online learning approach for supervisory switching control of partially-observed linear dynamical systems with finite-time performance bounds.
Deep learning framework combining 2.5D and 3D representations for COVID-19 detection from CT scans.
Training-free method for detecting AI-generated videos using spatial-temporal likelihood analysis.
Framework for aggregating noisy heterogeneous evidence in probabilistic reasoning tasks with explicit uncertainty quantification.
Theoretical framework for aggregating multiple evidence sources in probabilistic prediction with formal guarantees for multi-evidence reasoning.
Machine learning method extending weak adversarial neural pushforward to solve Fokker-Planck equations on Riemannian manifolds.
Kubernetes multi-tenancy and infrastructure scaling challenges posed by AI agents that require ephemeral environments.
Analysis arguing software won't become disposable despite AI coding agents, critiquing concepts like 'vibe coding' and ephemeral apps.
South Korea's SDT opens first commercial quantum-AI hybrid data center in Seoul with 20-qubit Kreo quantum computer and Nvidia DGX B200 integration.
Free SEO tool for analyzing blog posts with section-by-section content refresh recommendations.
ICML 2026 policy document on violations of LLM review policies and maintaining scientific integrity in peer review.
Scheduled: Open-source AI agent integrated with Gmail that autonomously reads meeting request emails, checks calendar availability, and drafts proposed times.
Military news article about RQ-180 stealth drone emergency landing at Greek air base.
ATO: GUI control panel managing multiple LLM agents (Claude Code, Codex, OpenClaw, Hermes) with workflow orchestration and MCP integration.
LA County courts pilot AI tool (Learned Hand) to summarize legal motions and draft rulings based on judge writing styles.
Conceptual framework on how AI agents transform organizational structure and decision-making beyond efficiency gains.