Ax Yulin Peng, Haowen Hou, Xinxin Zhu, Ying Tiffany He, F. Richard Yu 3/18/2026

SEMAG: Self-Evolutionary Multi-Agent Code Generation

SEMAG: a self-evolutionary multi-agent code generation framework that decomposes programming tasks into planning, coding, and debugging stages with adaptive workflow selection.

Ax Mateusz Dziemian, Maxwell Lin, Xiaohan Fu, Micha Nowak, Nick Winter, Eliot Jones, Andy Zou, Lama Ahmad, Kamalika Chaudhuri, Sahana Chennabasappa, Xander Davies, Lauren Deason, Benjamin L. Edelman, Tanner Emek, Ivan Evtimov, Jim Gust, Maia Hamin, Kat He, Klaudia Krawiecka, Riccardo Patana, Neil Perry, Troy Peterson, Xiangyu Qi, Javier Rando, Zifan Wang, Zihan Wang, Spencer Whitman, Eric Winsor, Arman Zharmagambetov, Matt Fredrikson, Zico Kolter 3/18/2026

How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition

Large-scale competition analysis revealing LLM agents' vulnerability to indirect prompt injection attacks through adversarial instructions in external content sources.

Ax MiroMind Team, S. Bai, L. Bing, L. Lei, R. Li, X. Li, X. Lin, E. Min, L. Su, B. Wang, L. Wang, L. Wang, S. Wang, X. Wang, Y. Zhang, Z. Zhang, G. Chen, L. Chen, Z. Cheng, Y. Deng, Z. Huang, D. Ng, J. Ni, Q. Ren, X. Tang, B. L. Wang, H. Wang, N. Wang, C. Wei, Q. Wu, J. Xia, Y. Xiao, H. Xu, X. Xu, C. Xue, Z. Yang, Z. Yang, F. Ye, H. Ye, J. Yu, C. Zhang, W. Zhang, H. Zhao, P. Zhu 3/18/2026

MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification

MiroThinker-1.7 and H1: research agents with enhanced verification and multi-step reasoning via structured planning and contextual reasoning for long-horizon tasks.

Ax Yihao Zhang, Zeming Wei, Xiaokun Luan, Chengcan Wu, Zhixin Zhang, Jiangrong Wu, Haolin Wu, Huanran Chen, Jun Sun, Meng Sun 3/18/2026

ClawWorm: Self-Propagating Attacks Across LLM Agent Ecosystems

ClawWorm: first documented self-propagating attack across LLM agent ecosystems, demonstrating security vulnerabilities in the OpenClaw platform, which has 40,000+ active instances.

Ax Aleph Alpha, Adnen Abdessaied, Artur Baranowski, Lukas Balles, Michael Barlow, Fabien C. Y. Benureau, Felix Berkenkamp, Lukas Bluebaum, Bastian Boll, Thomas F. Burns, Björn Deiseroth, Constantin Eichenberg, David Friede, Pablo Iyu Guerrero, Ahmed Hammam, Bastian Harren, Johann Higl, Yasser Jadidi, Carina Kauf, Johannes Messner, Jan Hendrik Metzen, Max Meuer, Vedant Nanda, Pit Neitemeier, Koen Oostermeijer, Letitia Parcalabescu, Markus Pernpointner, Felix Reinfurt, Dylan Rodriquez, Grégory Schott, Philipp Siedler, Martin Simonovsky, Till Speicher, Volker Stampa, Stephan Wäldchen, Samuel Weinbach, Gregor Ziegltrum 3/18/2026

A Family of LLMs Liberated from Static Vocabularies

LLM family with dynamic tokenizers that eliminate fixed-vocabulary constraints, at up to 70B parameters, with improved domain and language adaptation.

Ax Hanxian Huang, Igor Fedorov, Andrey Gromov, Bernard Beckerman, Naveen Suda, David Eriksson, Maximilian Balandat, Rylan Conway, Patrick Huber, Chinnadhurai Sankar, Ayushi Dalmia, Zechun Liu, Lemeng Wu, Tarek Elgamal, Adithya Sagar, Vikas Chandra, Raghuraman Krishnamoorthi 3/18/2026

MobileLLM-Flash: Latency-Guided On-Device LLM Design for Industry Scale

MobileLLM-Flash methodology designs on-device LLMs optimized for latency constraints using hardware-in-the-loop architecture search.

Ax Callen MacPhee, Yiming Zhou, Koichiro Kishima, Bahram Jalali 3/18/2026

Standardizing Medical Images at Scale for AI

Physics-based preprocessing framework standardizes heterogeneous medical images at scale for improved model generalization.

Ax Atharva Sehgal, James Hou, Akanksha Sarkar, Ishaan Mantripragada, Swarat Chaudhuri, Jennifer J. Sun, Yisong Yue 3/18/2026

Evaluating Agentic Optimization on Large Codebases

FormulaCode benchmark evaluates LLM coding agents on repository-level codebase optimization with realistic multi-objective constraints.