Wenzhe Cai

Google Scholar ID: NHQcCyAAAAAJ
Shanghai AI Laboratory
Reinforcement Learning, Visual Navigation, Robotics
Citations & Impact (all-time)
  • Citations: 397
  • H-index: 9
  • i10-index: 8
  • Publications: 20
  • Co-authors: 0
Resume
Academic Achievements
  • InternVLA-N1: The first open dual-system vision-language navigation foundation model.
  • InternScenes: A large-scale interactive indoor scene dataset comprising approximately 40,000 diverse scenes.
  • NavDP: Learning sim-to-real navigation diffusion policy with privileged information guidance.
  • StreamVLN: A streaming VLN framework that employs a hybrid slow-fast context modeling strategy to support multi-modal reasoning over interleaved vision, language, and action inputs.
  • ImagineNav: A novel navigation decision framework that uses imagination to generate candidate future images and lets VLMs select among them.
  • Boosting Efficient Reinforcement Learning for Vision-and-Language Navigation with Open-Sourced LLM: A hierarchical reinforcement learning method using efficient open-sourced LLMs as a high-level planner and an RL-based policy for sub-instruction accomplishment.
  • InstructNav: A zero-shot system for generic instruction navigation in unexplored environments.
  • MO-DDN: A coarse-to-fine attribute-based exploration agent for multi-object demand-driven navigation.
Research Experience
  • Researcher at Shanghai AI Laboratory, working closely with Dr. Tai Wang and Dr. Jiangmiao Pang.
Education
  • Ph.D. from Southeast University, advised by Prof. Changyin Sun; visiting student at Peking University, advised by Prof. Hao Dong.
Background
  • Research Interests: Embodied AI, with a focus on building intelligent robots that can comprehend diverse language instructions and exhibit adaptive navigation behaviors in the dynamic open world. Specializations: Embodied AI, Visual Navigation, and Deep Reinforcement Learning.
Miscellany
  • Contact: Email / Google Scholar / GitHub