Ax Jiayang Gao, Tianyi Zheng, Jiayang Zou, Fengxiang Yang, Shice Liu, Luyao Fan, Zheyu Zhang, Hao Zhang, Jinwei Chen, Peng-Tao Jiang, Bo Li, Jia Wang 22d ago

C$^2$FG: Control Classifier-Free Guidance via Score Discrepancy Analysis

Theoretical analysis of classifier-free guidance in diffusion models with bounds on score discrepancy for controlled guidance weights.

Ax Antoni Kowalczuk, Jan Dubiński, Franziska Boenisch, Adam Dziedzic 22d ago

Privacy Attacks on Image AutoRegressive Models

Comprehensive privacy attack analysis on image autoregressive models, identifying membership inference and extraction vulnerabilities.

Ax Charig Yang, Samiul Alam, Shakhrul Iman Siam, Michael J. Proulx, Lambert Mathias, Kiran Somasundaram, Luis Pesqueira, James Fort, Sheroze Sheriffdeen, Omkar Parkhi, Carl Ren, Mi Zhang, Yuning Chai, Richard Newcombe, Hyo Jin Kim 22d ago

Reading Recognition in the Wild

Task and dataset for detecting when users are reading in egocentric smart glasses video using multimodal models.

Ax Hsien-Chin Lin, Benjamin Matthias Ruppik, Carel van Niekerk, Chia-Hao Shen, Michael Heck, Nurul Lubis, Renato Vukovic, Shutong Feng, Milica Gašić 22d ago

Prompt reinforcing for long-term planning of large language models

Method to improve LLM performance in multi-turn conversations by reinforcing long-term planning and goal tracking through prompting.

Ax Xi Zhang, Hanwei Zhu, Yan Zhong, Jiamang Wang, Weisi Lin 22d ago

BADiff: Bandwidth Adaptive Diffusion Model

Framework enabling diffusion models to adapt generation quality based on real-time network bandwidth constraints in cloud-to-device scenarios.

Ax Bhuvan Sachdeva, Karan Uppal, Abhinav Java, Vineeth N. Balasubramanian 22d ago

Understanding Task Transfer in Vision-Language Models

Study of task transfer in Vision-Language Models examining how finetuning on one perception task affects performance on others.