Ax George Bredis, Nikita Balagansky, Daniil Gavrilov, Ruslan Rakhimov 3/4/2026

Next Embedding Prediction Makes World Models Stronger

The NE-Dreamer agent uses a temporal transformer to predict next-step embeddings, improving model-based reinforcement learning in high-dimensional domains.

Ax Zhenquan Yao, Zitong Huang, Yihan Zeng, Jianhua Han, Hang Xu, Chun-Mei Feng, Jianwei Ma, Wangmeng Zuo 3/4/2026

CGL: Advancing Continual GUI Learning via Reinforcement Fine-Tuning

A framework for continual learning in GUI agents that applies reinforcement fine-tuning to multimodal LLMs so they can adapt to new tasks without catastrophic forgetting.

Ax Robin Young 3/4/2026

Why Does RLAIF Work At All?

A theoretical explanation of why reinforcement learning from AI feedback (RLAIF) works, based on a latent value hypothesis.

Ax Dan Stowell 3/4/2026

Torus embeddings

Research on representing deep learning embeddings on toroidal manifolds instead of in Euclidean space, for efficiency.

Ax Tanishq Kumar, Tri Dao, Avner May 3/4/2026

Speculative Speculative Decoding

A novel speculative decoding technique that parallelizes token verification in LLM inference, speeding up autoregressive decoding.