Haoyu Lu

Google Scholar ID: LRxi6-UAAAAJ
Renmin University of China | Moonshot AI
multimodal foundation model, video-language modeling
Citations & Impact
All-time
  • Citations: 2,883
  • H-index: 15
  • i10-index: 16
  • Publications: 20
  • Co-authors: 10 (list available)
Resume (English only)
Academic Achievements
  • 1. DeepSeek-VL: Towards Real-World Vision-Language Understanding
  • 2. WenLan (悟道文澜): Bridging vision and language by large-scale multi-modal pre-training
  • 3. VDT: General-purpose Video Diffusion Transformers via Mask Modeling
  • 4. UniAdapter: Unified parameter-efficient transfer learning for cross-modal modeling
  • 5. LGDN: Language-Guided Denoising Network for Video-Language Modeling
  • 6. BMU-MoCo: Bidirectional momentum update for continual video-language modeling
  • 7. Towards artificial general intelligence via a multimodal foundation model
  • 8. COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval
  • 9. Learning versatile neural architectures by propagating network codes
  • 10. Compressed video contrastive learning
Research Experience
  • Works closely with Dr. Mingyu Ding at UC Berkeley and Prof. Bo Zhang at ZJU on research projects.
Education
  • Received a B.E. degree in Computer Science from Renmin University of China in 2021; currently pursuing a Ph.D. there, advised by Prof. Zhiwu Lu.
Background
  • Research interests: multimodal foundation models and video understanding. Currently a Ph.D. student at Renmin University of China, advised by Prof. Zhiwu Lu.