
Haibo Qiu

Google Scholar ID: O5gH5vkAAAAJ

University of Sydney

Multimodal LLM, Vision and Language, Computer Vision


Citations & Impact

Citations (all-time): 769

Contact

Email: haibo-qiu@outlook.com

Publications

22 items

AIR: Adaptive Interleaved Reasoning with Code in MLLMs

2026


Beyond NL2Code: A Structured Survey of Multimodal Code Intelligence

2026


Seirênes: Adversarial Self-Play with Evolving Distractions for LLM Reasoning

2026


SkillFlow: Benchmarking Lifelong Skill Discovery and Evolution for Autonomous Agents

2026


Omni-I2C: A Holistic Benchmark for High-Fidelity Image-to-Code Generation

2026


Flexible Entropy Control in RLVR with Gradient-Preserving Perspective

2026


TreeCUA: Efficiently Scaling GUI Automation with Tree-Structured Verifiable Evolution

2026


Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR

2026


Resume (English only)

Academic Achievements

Publications include:

'Metis-RISE: RL Incentivizes and SFT Enhances Multimodal Reasoning Model Learning' (arXiv, 2025)
'UniToken: Harmonizing Multimodal Understanding and Generation through Unified Visual Encoding' (CVPRW, 2025)
'PointHR: Exploring High-Resolution Architectures for 3D Point Cloud Segmentation' (arXiv, 2023)
'Collect-and-Distribute Transformer for 3D Point Cloud Analysis' (arXiv, 2023)
'GFNet: Geometric Flow Network for 3D Point Cloud Semantic Segmentation' (TMLR, 2022)
'SynFace: Face Recognition with Synthetic Data' (ICCV, 2021)
'End2End Occluded Face Recognition by Masking Corrupted Features' (TPAMI, 2021)
'Cross View Fusion for 3D Human Pose Estimation' (ICCV, 2019)
'Learning Basis Representation to Refine 3D Human Pose Estimations' (AAAI, 2019)

Serves as a reviewer for multiple international conferences and journals.

Research Experience

2024.04 - Present: Meituan Large Multimodal Model Group, Multimodal Researcher
2021.04 - 2022.04: JD Explore Academy, Research Intern, advised by Dr. Baosheng Yu
2019.05 - 2021.03: Tencent AI Lab, Research Intern, advised by Dr. Dihong Gong, Dr. Zhifeng Li, and Dr. Wei Liu
2017.07 - 2018.12: Microsoft Research Asia (MSRA), Research Intern, advised by Dr. Chunyu Wang and Prof. Wenjun Zeng

Education

Received a PhD from the School of Computer Science, University of Sydney, advised by Prof. Dacheng Tao and co-supervised by Prof. Baosheng Yu. Obtained a Bachelor's degree from the Department of Electronic Engineering and Information Science at the University of Science and Technology of China (USTC).

Background

Currently a Researcher at Meituan. Research interests center on multimodal learning, with a particular focus on MLLM post-training, unified multimodal understanding and generation, and multimodal reasoning models.

Miscellany