Scholar
Chenming Zhu
Google Scholar ID: QabwS_wAAAAJ
The University of Hong Kong
Multimodal Large Language Model
3D Vision
Citations & Impact
All-time
Citations: 1,003
H-index: 9
i10-index: 9
Publications: 14
Co-authors: 9 (list available)
Publications
8 items
MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence
2025
Cited
0
Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation
2025
Cited
0
G$^2$VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
2025
Cited
0
OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding
2025
Cited
0
StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling
2025
Cited
0
MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence
2025
Cited
0
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness
arXiv.org · 2024
Cited
3
MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations
Neural Information Processing Systems · 2024
Cited
16
Co-authors
9 total
Wenwei Zhang
Shanghai AI Laboratory
Jiangmiao Pang (庞江淼)
Shanghai AI Laboratory
Xihui Liu
University of Hong Kong, UC Berkeley, CUHK, Tsinghua University
Runsen Xu
The Chinese University of Hong Kong
Tai Wang
Shanghai AI Laboratory
Kai Chen
Shanghai AI Laboratory
Xiaoguang Han
Assistant Professor, The Chinese University of Hong Kong, Shenzhen
Xinge Zhu
The Chinese University of Hong Kong