Conghui He

Google Scholar ID: PopTv7kAAAAJ
Shanghai AI Laboratory
Data-centric AI, LLM, Document Intelligence
Citations & Impact (All-time)
  • Citations: 9,923
  • H-index: 34
  • i10-index: 84
  • Publications: 20
  • Co-authors: 13 (list available)
Resume (English only)
Academic Achievements
  • Author of over 150 papers in top-tier venues, with over 9,000 citations on Google Scholar. Recipient of the Gordon Bell Prize, ACL Best Theme Paper Award, and WAIC Yunfan Award. Multiple papers accepted by NeurIPS 2025, ICCV 2025, ACL 2025, CVPR 2025, NAACL 2025, and ICLR 2025.
Research Experience
  • Young Leading Scientist at Shanghai AI Lab, overseeing a dedicated data team that curates high-quality datasets for leading models such as InternLM and InternVL. Creator of the world's leading open-source data engine, MinerU.
Education
  • Ph.D. from Tsinghua University; Visiting Researcher at Stanford University and Imperial College London
Background
  • Young Leading Scientist at Shanghai AI Lab and Adjunct Doctoral Supervisor at the School of AI, Shanghai Jiao Tong University (SJTU). Recognized as a National-level Young Talent, he holds a Ph.D. from Tsinghua University and was a visiting researcher at Stanford University and Imperial College London. His research focuses on data-centric AI and high-performance computing.
Miscellany
  • Creator of open-source projects including MinerU, InternLM, and OpenDataLab.