Yuanhan Zhang

Google Scholar ID: g6grFy0AAAAJ
PhD Candidate, MMLab@NTU
Computer Vision, Machine Learning
Citations & Impact (all-time)
  • Citations: 8,103
  • H-index: 21
  • i10-index: 27
  • Publications: 20
  • Co-authors: 23
Resume (English only)
Academic Achievements
  • July 2025: Released Video Thinking Test (Video-TT), a holistic benchmark that compares the correctness and robustness of advanced reasoning and understanding in MLLMs against those of humans.
  • October 2024: Updated LLaVA-Video (formerly LLaVA-NeXT-Video), releasing both the model and the data.
  • August 2024: Released LLaVA-OneVision, an LMM that excels across single-image, multi-image, and video tasks.
  • July 2024: IJCV Outstanding Reviewer Award 2023.
  • July 2024: NOAH accepted by TPAMI.
  • July 2024: Three papers accepted at ECCV 2024.
  • June 2024: Organized CVPR 2024 workshop: Prompting in Vision.
  • May 2024: Released LLaVA-NeXT-Video.
  • September 2023: Visual Prompt Retrieval accepted to NeurIPS 2023.
  • September 2023: Talk at Alibaba DAMO Academy, hosted by Dr. Lidong Bing.
  • July 2023: Talk at HITSZ, hosted by Prof. Rui Shao.
  • June 2023: Introduced Otter.
  • October 2022: 1st place in Computer Vision in the Wild Challenge.
  • July 2022: OmniBenchmark accepted to ECCV 2022.
  • March 2022: Bamboo dataset released.
Research Experience
  • Contributed to research projects including Video Thinking Test (Video-TT), LLaVA-Video, and LLaVA-OneVision, with papers published at ECCV and NeurIPS and in TPAMI.
Education
  • Third-year PhD student at MMLab@NTU, supervised by Prof. Ziwei Liu.
Background
  • Research Interests: Computer vision and deep learning. Focuses on adapting foundation models, from vision to multimodal, for real-world use. This includes benchmarking model performance and adapting models through parameter-efficient tuning, in-context learning, and instruction tuning.
Miscellany
  • Contact: yuanhan002@e.ntu.edu.sg / Google Scholar / Twitter / GitHub