Published several papers, including AsynDM (arXiv 2025), Follow-Your-Preference (arXiv 2025), Follow-Your-Emoji-Faster (IJCV 2025), and Hunyuan-Game (arXiv 2025). Notably, HunyuanVideo, a systematic framework for large video generative models featuring an open-source 13B-parameter diffusion model, had received over 500 citations and over 11,000 GitHub stars as of November 2025.
Research Experience
Sep 2023 - Nov 2025, interned and then worked in the Hunyuan Multimodal Generation Foundation Model Team at Tencent, collaborating with Wei Liu, Liefeng Bo, and Zhao Zhong.
Jul 2022 - Aug 2023, interned in the Computer Vision Group at Baidu, collaborating with Xinyu Zhang and Jingdong Wang.
Education
2019-2024, Ph.D. in Computer Science from Zhejiang University, supervised by Professors Kun Kuang, Lanfen Lin, and Fei Wu.
2015-2019, B.E. in Automation from Zhejiang University of Technology, supervised by Professors Qi Xuan and Li Yu.
Background
Research interests include multimodal generative foundation models and visual self-supervised pre-training. Currently works on multimodal generative foundation models in the Hunyuan Multimodal Generation Foundation Model Team at Tencent.
Miscellany
Worked closely with friends such as Defang Chen and Yue Ma, whose insights have profoundly shaped his approach to research.