Ax Gaëtan Hadjeres, Marc Ferras, Khaled Koutini, Benno Weck, Alexandre Bittar, Thomas Hummel, Zineb Lahrici, Hakim Missoum, Joan Serrà, Yuki Mitsufuji 8d ago

Woosh: A Sound Effects Foundation Model

Open-source sound effects foundation model from Sony AI with an audio encoder/decoder and text-to-audio capabilities.

Ax Dun Yuan, Fuyuan Lyu, Ye Yuan, Weixu Zhang, Bowei He, Jiayi Geng, Linfeng Du, Zipeng Sun, Yankai Chen, Changjiang Han, Jikun Kang, Xi Chen, Haolun Wu, Xue Liu 8d ago

Beyond Message Passing: A Semantic View of Agent Communication Protocols

Framework analyzing agent communication protocols for LLM systems across three layers: communication, syntactic, and semantic. Systematically organizes 18 representative protocols.

Ax Daniele Solombrino, Antonio Andrea Gargiulo, Adrian Robert Minut, Luca Zhou, Alessandro Zirilli, Emanuele Rodolà 8d ago

Zero-Shot Quantization via Weight-Space Arithmetic

Zero-shot quantization method using weight-space arithmetic to improve post-training quantization robustness across models.

Ax Manish Bhatt, Sarthak Munshi, Vineeth Sai Narajala, Idan Habler, Ammar Al-Kahfah, Ken Huang, Joel Webb, Blake Gatto, Md Tamjidul Hoque 8d ago

The Defense Trilemma: Why Prompt Injection Defense Wrappers Fail?

Theoretical analysis proving limitations of continuous wrapper defenses against prompt injection attacks in LLMs.

Ax Xiangru Jian, Hao Xu, Wei Pang, Xinjian Zhao, Chengyu Tao, Qixin Zhang, Xikun Zhang, Chao Zhang, Guanzhi Deng, Alex Xue, Juan Du, Tianshu Yu, Garth Tarr, Linqi Song, Qiuzhuang Sun, Dacheng Tao 8d ago

FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios

Fine-grained benchmark evaluating multimodal LLMs on manufacturing scenarios.

Ax Peng Wang (The Chinese University of Hong Kong, Shenzhen), Yanqiao Zhu (X-LANCE Lab, Shanghai Jiao Tong University), Zixuan Jiang (Xi'an Jiaotong University), Qinyuan Chen (Fudan University), Xingjian Zhao (Fudan University), Xipeng Qiu (Fudan University), Wupeng Wang (Tongyi Fun Team, Alibaba Group), Zhifu Gao (Tongyi Fun Team, Alibaba Group), Xiangang Li (Tongyi Fun Team, Alibaba Group), Kai Yu (X-LANCE Lab, Shanghai Jiao Tong University), Xie Chen (X-LANCE Lab, Shanghai Jiao Tong University) 8d ago

Interactive ASR: Towards Human-Like Interaction and Semantic Coherence Evaluation for Agentic Speech Recognition

Interactive ASR system with semantic coherence evaluation and human-like correction mechanisms.

Ax Jingyu Zhang, Tianjian Li, William Jurayj, Hongyuan Zhan, Benjamin Van Durme, Daniel Khashabi 8d ago

Many-Tier Instruction Hierarchy in LLM Agents

Framework for managing hierarchical instruction conflicts in multi-source LLM agent environments.

Ax Julio Candanedo 8d ago

The Diffusion-Attention Connection

Theoretical connection between Transformers, diffusion maps, and magnetic Laplacians through Markov geometry.

Ax Hua-Dong Xiong (School of Psychological and Brain Sciences, Georgia Tech), Li Ji-An (Department of Psychology, New York University), Jiaqi Huang (Department of Cognitive Science, Indiana University Bloomington, Honda Research Institute), Robert C. Wilson (School of Psychological and Brain Sciences, Georgia Tech, Center of Excellence for Computational Cognition, Georgia Tech), Kwonjoon Lee (Honda Research Institute), Xue-Xin Wei (Departments of Neuroscience and Psychology, The University of Texas at Austin) 8d ago

Human-like Working Memory Interference in Large Language Models

Analysis of working memory limitations in LLMs and comparison with biological systems.

Ax Vijay Lingam, Aditya Golatkar, Anwesan Pal, Ben Vo, Narayanan Sadagopan, Alessandro Achille, Jun Huan, Anoop Deoras, Stefano Soatto 8d ago

ExecTune: Effective Steering of Black-Box LLMs with Guide Models

Guide-Core Policies framework for black-box LLM agents, in which guide models generate structured strategies that core models execute, reducing inference costs.