Dongming Wu
Scholar

Google Scholar ID: ejFCAq0AAAAJ
MMLab, CUHK; CPII
Computer Vision, Vision and Language, MLLM, Embodied AI
Citations & Impact (all-time)
  • Citations: 1,071
  • H-index: 12
  • i10-index: 12
  • Publications: 17
  • Co-authors: 7
Resume (English only)
Academic Achievements
  • Publications: ICCV 2025 - RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark; CVPR 2025 - DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation; AAAI 2025 - Language Prompt for Autonomous Driving; ECCV 2024 - Merlin: Empowering Multimodal LLMs; ICLR 2024 - TopoMLP
  • Preprints: Grounding Beyond Detection: Enhancing Contextual Understanding in Embodied 3D Grounding; Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving?; Bootstrapping Referring Multi-Object Tracking
  • Awards: 2025.06 Outstanding Graduates of Beijing; 2024.05 Excellent Doctoral Thesis Seedling Fund
Research Experience
  • Dexmal, Research Intern; mentors: Yingfei Liu and Tiancai Wang
  • MBZUAI, Visiting Student; mentors: Prof. Rao Muhammad Anwer and Prof. Fahad Shahbaz Khan
  • MEGVII, Research Intern; mentors: Tiancai Wang and Xiangyu Zhang
  • IIAI, Research Intern; mentors: Xingping Dong and Prof. Ling Shao
Education
  • 2025.06: Ph.D., Department of Computer Science, Beijing Institute of Technology; advised by Prof. Jianbing Shen
  • 2019.06: Bachelor's degree from the Class of Xu, Beijing Institute of Technology
Background
  • Research interests include vision-language learning, multimodal large language models (MLLMs), and embodied agents. During graduate studies, the focus was on building intelligent perception models that understand both visual and linguistic information. Recent work explores decision-making systems capable of actively interacting with humans and dynamic environments. The ultimate goal is to develop human-like agents that can perceive real-world environments and make autonomous decisions.
Miscellany
  • Open to collaboration and to discussions about the latest advancements in the field.