Scholar

Rongyao Fang (方荣耀)

Google Scholar ID: FtH3CW4AAAAJ

MMLab, The Chinese University of Hong Kong

Artificial General Intelligence, Deep Learning, Multimodal

Citations & Impact

Citations (all-time): 3,523

Publications

19 items

Resume (English only)

Academic Achievements

Published multiple papers at top-tier venues including NeurIPS, CVPR, ICCV, ECCV, TPAMI, and IJCV
Key works include: GoT (NeurIPS 2025), Puma (ICCV 2025), SOLVE (CVPR 2025), FeatAug-DETR (TPAMI 2024), FouriScale (ECCV 2024)
Co-developed FLUX-Reason-6M, a million-scale text-to-image reasoning dataset, and the PRISM-Bench benchmark (2025)
Proposed CodePlot-CoT for mathematical visual reasoning using code-driven images (2025)
Served as first or co-first author on multiple influential publications, advancing visual generation and multimodal reasoning

Background

Ph.D. candidate at the Multimedia Laboratory (MMLab), The Chinese University of Hong Kong (CUHK), expected to graduate in 2025
Research driven by a passion for Artificial General Intelligence (AGI), with a focus on visual understanding and generation
Dedicated to building integrated systems capable of perceiving, understanding, and generating visual content using advanced multimodal large language models
Supervised by Prof. Hongsheng Li, in close collaboration with Prof. Xihui Liu throughout the Ph.D.
Former visiting scholar at MIT CSAIL, advised by Prof. Dina Katabi

Co-authors

16 total