Haoran Sun, Yongjian Guo, Zhong Guan, Shuai Di, Xiaodong Bai, Jing Long, Tianyun Zhao, Mingxi Luo, Hongke Zhao, Likang Wu, Xiaotie Deng, Xu Chu, Xi Xiao, Sheng Wen, Yicheng Gong, Junwu Xiong 25d ago

RL-VLA$^3$: A Flexible and Asynchronous Reinforcement Learning Framework for VLA Training

Asynchronous reinforcement learning framework for vision-language-action model training, enabling flexible post-training optimization for embodied agents.

Xue Liu, Xin Ma, Yuxin Ma, Yongchang Peng, Duo Wang, Zhoufutu Wen, Ge Zhang, Kaiyuan Zhang, Xinyu Chen, Tianci He, Jiani Hou, Liang Hu, Ziyun Huang, Yongzhe Hui, Jianpeng Jiao, Chennan Ju, Yingru Kong, Yiran Li, Mengyun Liu, Luyao Ma, Fei Ni, Yiqing Ni, Yueyan Qiu, Yanle Ren, Zilin Shi, Zaiyuan Wang, Wenjie Yue, Shiyu Zhang, Xinyi Zhang, Kaiwen Zhao, Zhenwei Zhu, Shanshan Wu, Qi Zhao, Wenhao Huang 25d ago

Xpertbench: Expert Level Tasks with Rubrics-Based Evaluation

High-fidelity benchmark that uses rubrics-based evaluation to assess LLMs on complex, open-ended, expert-level tasks across multiple domains.

Saad Alqithami 25d ago

Soft Tournament Equilibrium

Theoretical framework for evaluating cyclic non-transitive interactions between LLM-based agents using equilibrium concepts instead of linear rankings.

Jingyang Qiao, Weicheng Meng, Yu Cheng, Zhihang Lin, Zhizhong Zhang, Xin Tan, Jingyu Gong, Kun Shao, Yuan Xie 25d ago

Memory Intelligence Agent

Memory system for deep research agents that improves trajectory retrieval and memory evolution to enhance LLM reasoning and autonomous learning.

Kutay Tire, Ege Onur Taga, Muhammed Emrullah Ildiz, Samet Oymak 25d ago

Retrieval Augmented Time Series Forecasting

Retrieval-augmented generation applied to time-series foundation models for zero-shot forecasting across domains.

Hammad Ayyubi, Junzhang Liu, Ali Asgarov, Zaber Ibn Abdul Hakim, Najibul Haque Sarker, Zhecan Wang, Chia-Wei Tang, Hani Alomari, Md. Atabuzzaman, Xudong Lin, Naveen Reddy Dyava, Shih-Fu Chang, Chris Thomas 25d ago

ENTER: Event Based Interpretable Reasoning for VideoQA

Event-graph-based system for interpretable VideoQA that combines code generation with contextual reasoning.

Chaofan Pan, Xin Yang, Yanhua Li, Wei Wei, Tianrui Li, Bo An, Jiye Liang 25d ago

A Survey of Continual Reinforcement Learning

Survey of continual reinforcement learning covering sequential decision-making, generalization, and adaptation across dynamic tasks.