Published papers: Accepted at NeurIPS 2025, CVPR 2025, ECCV 2024, IJCV, etc. Awards: Outstanding Reviewer at CVPR 2025, Distinguished Paper at EgoVis CVPR 2025, Best Student Paper Prize at BMVC, etc.
Research Experience
Research Scientist Intern at Meta Superintelligence Labs, Multimedia Core Video Generation Team (May 2025 – Present). Working on analyzing and improving the diffusibility of high-dimensional latent spaces, with engineering experience on the large-scale MovieGen codebase and distributed training.
Education
PhD: Machine Learning Program at Georgia Institute of Technology, advised by Prof. James Rehg and co-advised by Prof. Zsolt Kira. Master's: ECE from Shanghai Jiao Tong University, advised by Prof. Ya Zhang. Bachelor's: Information Engineering from Shanghai Jiao Tong University.
Background
Research interests: Multimodal Learning, including Multimodal Understanding (e.g., VLMs, MLLMs) and Image/Video Generation (e.g., diffusion, flow matching). Career goal: to build omni multimodal systems that can understand, reason, and generate across text, image, video, and audio.
Miscellany
Looking for a full-time Research Scientist / Applied Scientist / ML Engineer position (available starting Dec. 2025). Open to collaborations with motivated graduate/undergraduate students.