Ax Xiangru Jian, Hao Xu, Wei Pang, Xinjian Zhao, Chengyu Tao, Qixin Zhang, Xikun Zhang, Chao Zhang, Guanzhi Deng, Alex Xue, Juan Du, Tianshu Yu, Garth Tarr, Linqi Song, Qiuzhuang Sun, Dacheng Tao 8d ago

FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios

Fine-grained benchmark evaluating multimodal LLMs on manufacturing scenarios.

Ax Peng Wang (The Chinese University of Hong Kong, Shenzhen), Yanqiao Zhu (X-LANCE Lab, Shanghai Jiao Tong University), Zixuan Jiang (Xi'an Jiaotong University), Qinyuan Chen (Fudan University), Xingjian Zhao (Fudan University), Xipeng Qiu (Fudan University), Wupeng Wang (Tongyi Fun Team, Alibaba Group), Zhifu Gao (Tongyi Fun Team, Alibaba Group), Xiangang Li (Tongyi Fun Team, Alibaba Group), Kai Yu (X-LANCE Lab, Shanghai Jiao Tong University), Xie Chen (X-LANCE Lab, Shanghai Jiao Tong University) 8d ago

Interactive ASR: Towards Human-Like Interaction and Semantic Coherence Evaluation for Agentic Speech Recognition

Interactive ASR system with semantic coherence evaluation and human-like correction mechanisms.

Ax Jingyu Zhang, Tianjian Li, William Jurayj, Hongyuan Zhan, Benjamin Van Durme, Daniel Khashabi 8d ago

Many-Tier Instruction Hierarchy in LLM Agents

Framework for managing hierarchical instruction conflicts in multi-source LLM agent environments.

Ax Julio Candanedo 8d ago

The Diffusion-Attention Connection

Theoretical connection between Transformers, diffusion maps, and magnetic Laplacians through Markov geometry.

Ax Hua-Dong Xiong (School of Psychological and Brain Sciences, Georgia Tech), Li Ji-An (Department of Psychology, New York University), Jiaqi Huang (Department of Cognitive Science, Indiana University Bloomington, Honda Research Institute), Robert C. Wilson (School of Psychological and Brain Sciences, Georgia Tech, Center of Excellence for Computational Cognition, Georgia Tech), Kwonjoon Lee (Honda Research Institute), Xue-Xin Wei (Departments of Neuroscience and Psychology, The University of Texas at Austin) 8d ago

Human-like Working Memory Interference in Large Language Models

Analysis of working memory limitations in LLMs and comparison with biological systems.

Ax Vijay Lingam, Aditya Golatkar, Anwesan Pal, Ben Vo, Narayanan Sadagopan, Alessandro Achille, Jun Huan, Anoop Deoras, Stefano Soatto 8d ago

ExecTune: Effective Steering of Black-Box LLMs with Guide Models

Guide-Core Policies framework for black-box LLM agents, in which guide models generate structured strategies that core models execute, reducing inference costs.

Ax Smita Deb, Shirin Panahi, Mulugeta Haile, Ying-Cheng Lai 8d ago

Vestibular reservoir computing

Physical reservoir computing inspired by the biological vestibular system, addressing hardware complexity with a designed uncoupled topology.

Ax Zunhai Su, Hengyuan Zhang, Wei Wu, Yifan Zhang, Yaxiu Liu, He Xiao, Qingyao Yang, Yuxuan Sun, Rui Yang, Chao Zhang, Keyu Fan, Weihao Ye, Jing Xiong, Hui Shen, Chaofan Tao, Taiqiang Wu, Zhongwei Wan, Yulei Qian, Yuchen Xie, Ngai Wong 8d ago

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Survey on the attention sink phenomenon in Transformers, covering utilization, interpretation, and mitigation strategies.