Ax Zixuan Hu, Yongxian Wei, Li Shen, Zhenyi Wang, Baoyuan Wu, Chun Yuan, Dacheng Tao 3d ago

Task-Distributionally Robust Data-Free Meta-Learning

Robustness analysis of data-free meta-learning, examining failure modes that arise when meta-learning from pre-trained models without access to their training data.

Ax Xin He, Wenqi Fan, Yili Wang, Chengyi Liu, Rui Miao, Xin Juan, Xin Wang 3d ago

Graph Defense Diffusion Model

Diffusion-model approach to defending graph neural networks against adversarial attacks.

Ax ShengYun Peng, Eric Smith, Ivan Evtimov, Song Jiang, Pin-Yu Chen, Hongyuan Zhan, Haozhu Wang, Duen Horng Chau, Mahesh Pasupuleti, Jianfeng Chi 3d ago

Large Reasoning Models Learn Better Alignment from Flawed Thinking

RECAP: RL method for safety alignment in large reasoning models, teaching critical evaluation of flawed premises via counter-aligned prefilling.

Ax Zhaoyang Zhang, Shuli Jiang, Yantao Shen, Yuting Zhang, Dhananjay Ram, Shuo Yang, Zhuowen Tu, Wei Xia, Stefano Soatto 3d ago

Reinforcement-aware Knowledge Distillation for LLM Reasoning

Reinforcement-aware knowledge distillation method that transfers RL-trained reasoning LLMs into smaller models while preserving chain-of-thought capability.

Ax Uzay Macar, Li Yang, Atticus Wang, Peter Wallich, Emmanuel Ameisen, Jack Lindsey 3d ago

Mechanisms of Introspective Awareness

Investigates mechanisms of introspective awareness in LLMs, where models detect injected steering vectors with minimal false positives.