Ax Tao Wang, Suhang Zheng, Xiaoxiao Xu 9d ago

RTMC: Step-Level Credit Assignment via Rollout Trees

Rollout-tree-based credit assignment method for multi-step agentic RL that leverages implicit state overlap between group rollouts to avoid uniform advantage assignment.

Ax Wei Li, Hangjie Yuan, Zixiang Zhao, Borui Kang, Ziwei Liu, Tao Feng 9d ago

A Faster Path to Continual Learning

Optimization technique for continual learning that reduces the computational overhead of C-Flat while preserving its ability to balance new-task and old-task performance.

Ax Siyu Sun, Jing Ren, Zhaohe Liao, Dongxiao Mao, Xiangyuan Ren, Yiyi Zhang, Haohua Zhao, Weixiong Lin, Jiang Shaohua, Liqing Zhang, Yuchao Zheng 9d ago

Bottleneck Tokens for Unified Multimodal Retrieval

Bottleneck-token framework for unified multimodal retrieval in decoder-only MLLMs that provides explicit pooling and token-level guidance for embedding alignment.

Ax Vikrant Malik, Taylan Kargin, Babak Hassibi 9d ago

Distributionally Robust K-Means Clustering

Distributionally robust variant of k-means clustering using Wasserstein-2 balls to protect against outliers, distribution shifts, and limited sample sizes.