Ax Mehran Taghian, Yunke Peng, Xing Huang, Yao Wang, Yaoyuan Wang, Wei Guo, Yuanyong Luo, Tianchi Hu, Junsong Wang, Xin Wang, Hu Liu, Yu Cheng, Ziwei Yu, Hongliang Li, Mehdi Rahimifar, Lei Yan, Xuefei Wang, Zhuang Ma, Lei Liu, Hui Yu, Anandharaju Durai Raju, Hoang Le, Hei Yi Mak, Tanzila Rahman, Shadan Golestan 2d ago

HiFloat4 Format for Language Model Pre-training on Ascend NPUs

4-bit floating-point format (HiFloat4) for efficient language model pre-training on Ascend NPU hardware.

Ax Chia-Hong Hsu, Frank Wood 2d ago

Discrete Meanflow Training Curriculum

Training curriculum method for discrete flow-based image generation models to improve one-step sampling stability and quality.

Ax Amrut Nadgir, Vijay Balasubramanian, Pratik Chaudhari 2d ago

How does Chain of Thought decompose complex tasks?

Demonstrates that classification error scales as a power law in the number of classes, and that chain-of-thought decomposition reduces error by splitting the task.

Ax Vladimír Holý, Michal Černý 2d ago

Score-Driven Rating System for Sports

Score-driven rating system that extends the classical Elo rating to accommodate diverse game outcomes and rankings.