Ziyu Guo

Google Scholar ID: S9GLetwAAAAJ
The Chinese University of Hong Kong
Multi-modality Learning, LLM/VLMs, 3D Vision
Citations & Impact
All-time
Citations: 3,872
H-index: 25
i10-index: 30
Publications: 20
Co-authors: 2 (list available)
Resume
Academic Achievements
  • CoT/CoF Reasoning for Visual Generation (arXiv)
  • Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark (Technical Report)
  • Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step (Under Review)
  • T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT (NeurIPS 2025)
  • Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO (NeurIPS 2025)
  • SciVerse: Unveiling the Knowledge Comprehension and Visual Reasoning of LMMs on Multi-modal Scientific Problems (ACL 2025)
  • MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency (ICML 2025)
  • MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine (ICLR 2025)
  • MathVerse: Does your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? (ECCV 2024)
  • MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines (ICLR 2025)
  • Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following (Under Review)
  • Exploring the Potential of Encoder-free Architectures in 3D LMMs (Under Review)
  • SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners (Technical Report)
  • PointCLIP: Point Cloud Understanding by CLIP (CVPR 2022)
Research Experience
  • Research Intern at Meta
  • Research Intern at Amazon AWS AI Lab
  • Research Intern at Roblox
  • Research Intern at Tencent
  • Research Intern at Shanghai AI Laboratory
Education
  • Ph.D. Candidate, The Chinese University of Hong Kong, Department of Computer Science and Engineering, Supervisor: Prof. Pheng-Ann Heng
  • Bachelor’s Degree, Peking University, Computer Science, Supervisor: Prof. Bin Cui
Background
  • Research Interests: Multi-modal Learning, Large Language/Vision Models, and 3D Vision
  • Professional Field: Computer Science and Engineering