Zhuofan Zong
Google Scholar ID: vls0YhoAAAAJ
MMLab, The Chinese University of Hong Kong
Large Models · Multimodal · Object Detection · 3D Object Detection
Citations & Impact
All-time
  • Citations: 1,499
  • H-index: 13
  • i10-index: 15
  • Publications: 19
  • Co-authors: 0
Resume
Academic Achievements
  • Publications:
    - VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping (2024, arXiv)
    - EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM (2025, ICML)
    - MoVA: Adapting Mixture of Vision Experts to Multimodal Context (2024, NeurIPS)
    - Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models (2024, NeurIPS)
    - Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning (2024, NeurIPS, Spotlight Presentation)
Research Experience
  • Research Intern in the Base Model Department at SenseTime Research, working closely with Guanglu Song and Yu Liu
  • Core member of the founding team for frontline R&D projects, including a large vision foundation model, a multimodal interactive model, and the AIGC product SenseMirage
Education
  • Ph.D.: The Chinese University of Hong Kong, MMLab; Advisor: Prof. Hongsheng Li
  • Master's Degree: Beihang University; Advisor: Prof. Biao Leng
  • Bachelor's Degree: Beihang University; Advisor: Prof. Biao Leng
Background
  • Research Interests: Generative AI, particularly diffusion models and multimodal large language models
  • Professional Field: Visual content generation, multimodal understanding
  • Brief Introduction: A third-year Ph.D. student at MMLab, The Chinese University of Hong Kong, supervised by Prof. Hongsheng Li. Received both Bachelor's and Master's degrees from Beihang University, supervised by Prof. Biao Leng.