Haibo Qiu

Google Scholar ID: O5gH5vkAAAAJ
University of Sydney
Multimodal LLM, Vision and Language, Computer Vision
Citations & Impact (all-time)
  • Citations: 769
  • H-index: 8
  • i10-index: 8
  • Publications: 12
  • Co-authors: 12 (list available)
Resume (English only)
Academic Achievements
  • Publications include:
      • 'Metis-RISE: RL Incentivizes and SFT Enhances Multimodal Reasoning Model Learning' (arXiv, 2025)
      • 'UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding' (CVPRW, 2025)
      • 'PointHR: Exploring High-Resolution Architectures for 3D Point Cloud Segmentation' (arXiv, 2023)
      • 'Collect-and-Distribute Transformer for 3D Point Cloud Analysis' (arXiv, 2023)
      • 'GFNet: Geometric Flow Network for 3D Point Cloud Semantic Segmentation' (TMLR, 2022)
      • 'SynFace: Face Recognition with Synthetic Data' (ICCV, 2021)
      • 'End2End Occluded Face Recognition by Masking Corrupted Features' (TPAMI, 2021)
      • 'Cross View Fusion for 3D Human Pose Estimation' (ICCV, 2019)
      • 'Learning Basis Representation to Refine 3D Human Pose Estimations' (AAAI, 2019)
  • Reviewer for multiple international conferences and journals.
Research Experience
  • 2024.04 - Present: Meituan Large Multimodal Model Group, Multimodal Researcher.
  • 2021.04 - 2022.04: JD Explore Academy, Research Intern, advised by Dr. Baosheng Yu.
  • 2019.05 - 2021.03: Tencent AI Lab, Research Intern, advised by Dr. Dihong Gong, Dr. Zhifeng Li, and Dr. Wei Liu.
  • 2017.07 - 2018.12: Microsoft Research Asia (MSRA), Research Intern, advised by Dr. Chunyu Wang and Prof. Wenjun Zeng.
Education
  • Received a PhD from the School of Computer Science, University of Sydney, advised by Prof. Dacheng Tao and co-supervised by Prof. Baosheng Yu. Obtained a Bachelor's degree from the Department of Electronic Engineering and Information Science at the University of Science and Technology of China (USTC).
Background
  • Currently working at Meituan as a Researcher. Research interests center on multimodal learning, with a particular focus on MLLM post-training, unified multimodal understanding and generation, and multimodal reasoning models.
Miscellany
  • Website last updated in July 2025. Modified from Leonid Keselman's and Jon Barron's websites.