Dongzhi Jiang
Google Scholar ID: jIR4PAsAAAAJ
MMLab, CUHK
Citations & Impact (All-time)
  • Citations: 848
  • H-index: 10
  • i10-index: 11
  • Publications: 16
  • Co-authors: 4
Resume (English only)
Academic Achievements
  • 1. T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
  • 2. MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
  • 3. EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM
  • 4. MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines
  • 5. MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine
  • 6. CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
  • 7. MoVA: Adapting Mixture of Vision Experts to Multimodal Context
  • 8. MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Research Experience
  • Current research focus: multimodal large language models.
Education
  • PhD student in Multimedia Lab, CUHK, supervised by Prof. Hongsheng Li.
Background
  • Research Interests: Multimodal Large Language Model (MLLM) and Text-to-Image models.
Miscellany
  • Contact: Email, Google Scholar, GitHub