Chuofan Ma

Google Scholar ID: hgKtgWAAAAAJ
PhD student of Electrical and Electronic Engineering, The University of Hong Kong
Computer Vision, Machine Learning
Citations & Impact
  • Citations (all-time): 255
  • H-index: 7
  • i10-index: 6
  • Publications: 9
  • Co-authors: 5
Resume (English only)
Academic Achievements
  • UniTok: A Unified Tokenizer for Visual Generation and Understanding, NeurIPS (Spotlight), 2025
  • Liquid: Language Models are Scalable and Unified Multi-modal Generators, arXiv preprint, 2024
  • Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Generation, NeurIPS, 2025
  • Learning from Neighbors: Category Extrapolation for Long-Tail Learning, CVPR, 2025
  • Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models, ECCV, 2024
  • CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection, NeurIPS, 2023
  • Recognize Any Regions, NeurIPS, 2024
  • EGC: Image Generation and Classification via a Diffusion Energy-Based Model, ICCV, 2023
  • Rethinking Resolution in the Context of Efficient Video Recognition, NeurIPS, 2022
Research Experience
  • Contributed to multiple research projects, including UniTok and Liquid.
Education
  • Bachelor's degree in Computer Science from The University of Hong Kong (HKU); currently a Ph.D. student at CVMI Lab, HKU, supervised by Prof. Xiaojuan Qi.
Background
  • Primary research interest lies in open-world visual intelligence and multi-modal foundation models. Open to collaboration opportunities.