Scholar

Chenming Zhu

Google Scholar ID: QabwS_wAAAAJ

The University of Hong Kong

Multimodal Large Language Model, 3D Vision

Citations & Impact

All-time

Citations

1,003

H-index

9

i10-index

9

Publications

14

Co-authors

9

Publications

10 items

G$^2$TAM: Geometry Grounded Track Anything Model

2026

Cited

0

Thinking with Imagination: Agentic Visual Spatial Reasoning with World Simulators

2026

Cited

0

MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence

2025

Cited

0

Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation

2025

Cited

0

G$^2$VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning

2025

Cited

0

OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding

2025

Cited

0

StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling

2025

Cited

0

MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence

2025

Cited

0

Co-authors

9 total

Shanghai AI Laboratory

Jiangmiao Pang (庞江淼)

Shanghai AI Laboratory

University of Hong Kong, UC Berkeley, CUHK, Tsinghua University

The Chinese University of Hong Kong

Shanghai AI Laboratory

Shanghai AI Laboratory

Assistant Professor, The Chinese University of Hong Kong, Shenzhen

The Chinese University of Hong Kong