Dongzhi Jiang
Google Scholar ID: jIR4PAsAAAAJ
MMLab, CUHK
Citations & Impact (All-time)
  • Citations: 848
  • H-index: 10
  • i10-index: 11
  • Publications: 16
  • Co-authors: 4
Resume (English only)
Academic Achievements
  • 1. T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
  • 2. MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
  • 3. EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM
  • 4. MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines
  • 5. MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine
  • 6. CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
  • 7. MoVA: Adapting Mixture of Vision Experts to Multimodal Context
  • 8. MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Research Experience
  • Current research focus: multimodal large language models.
Education
  • PhD student in Multimedia Lab, CUHK, supervised by Prof. Hongsheng Li.
Background
  • Research Interests: Multimodal Large Language Model (MLLM) and Text-to-Image models.
Miscellany
  • Contact: Email, Google Scholar, GitHub