Ax Rohit Kundu, Vishal Mohanty, Hao Xiong, Shan Jia, Athula Balachandran, Amit K. Roy-Chowdhury 29d ago

SAGA: Source Attribution of Generative AI Videos

SAGA: framework for source attribution of AI-generated videos. Identifies the specific generative model used, rather than performing binary real/fake detection.

Ax Chengqi Dong, Chuhuai Yue, Hang He, Rongge Mao, Fenghe Tang, S Kevin Zhou, Zekun Xu, Xiaohan Wang, Jiajun Chai, Guojun Yin 29d ago

Training Multi-Image Vision Agents via End2End Reinforcement Learning

IMAgent: open-source visual agent trained with end-to-end RL for multi-image reasoning tasks, addressing limitations of single-image VLM agents.

Ax Sashuai Zhou, Qiang Zhou, Jijin Hu, Hanqing Yang, Yue Cao, Junpeng Ma, Yinchao Ma, Jun Song, Tiezheng Ge, Cheng Yu, Bo Zheng, Zhou Zhao 29d ago

Unified Thinker: A General Reasoning Modular Core for Image Generation

Open-source image generation model with improved reasoning for logic-intensive instruction following, closing the gap to closed-source systems.

Ax Xiangyang Zhu, Yuan Tian, Qi Jia, Kaiwei Zhang, Zicheng Zhang, Chunyi Li, Kaiyuan Ji, Dongrui Liu, Zijian Chen, Lu Sun, Renrui Zhang, Yan Teng, Jing Shao, Wei Sun, Xia Hu, Yu Qiao, Guangtao Zhai 29d ago

SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond

SafeSci: comprehensive benchmark and framework for evaluating LLM safety in scientific domains with multi-domain risk coverage and objective evaluation.

Ax Patrice Bechard, Orlando Marquez Ayala, Emily Chen, Jordan Skelton, Sagar Davasam, Srinivas Sunkara, Vikas Yadav, Sai Rajeswar 29d ago

Terminal Agents Suffice for Enterprise Automation

Terminal agents executing enterprise tasks via CLI are simpler and more cost-effective than tool-augmented or web agents.