Taiwei Shi, Sihao Chen, Bowen Jiang, Linxin Song, Longqi Yang, Jieyu Zhao 2/17/2026

Experiential Reinforcement Learning

Training paradigm embedding experience replay in reinforcement learning for LMs to learn from sparse, delayed environmental feedback.

Nicolas Zumarraga, Thomas Kaar, Ning Wang, Maxwell A. Xu, Max Rosenblattl, Markus Kreft, Kevin O'Sullivan, Paul Schmiedmayer, Patrick Langer, Robert Jakob 2/17/2026

TS-Haystack: A Multi-Scale Retrieval Benchmark for Time Series Language Models

TS-Haystack benchmark evaluates time series language models on long-context retrieval with millions of datapoints, requiring precise temporal localization.

Yaxuan Kong, Hoyoung Lee, Yoontae Hwang, Alejandro Lopez-Lira, Bradford Levy, Dhagash Mehta, Qingsong Wen, Chanyeol Choi, Yongjae Lee, Stefan Zohren 2/17/2026

Evaluating LLMs in Finance Requires Explicit Bias Consideration

Analysis identifying five recurring biases in financial LLM applications (look-ahead, survivorship, narrative, objective, and cost bias) that invalidate deployment claims.