1. DeepSeek-VL: Towards Real-World Vision-Language Understanding
2. WenLan (悟道文澜): Bridging Vision and Language by Large-Scale Multi-Modal Pre-Training
3. VDT: General-purpose Video Diffusion Transformers via Mask Modeling
4. UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-Modal Modeling
5. LGDN: Language-Guided Denoising Network for Video-Language Modeling
6. BMU-MoCo: Bidirectional Momentum Update for Continual Video-Language Modeling
7. Towards Artificial General Intelligence via a Multimodal Foundation Model
8. COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval
9. Learning Versatile Neural Architectures by Propagating Network Codes
10. Compressed Video Contrastive Learning
Research Experience
Collaborates closely with Dr. Mingyu Ding (UC Berkeley) and Prof. Bo Zhang (ZJU) on research projects.
Education
Received a B.E. degree in Computer Science from Renmin University of China in 2021; currently pursuing a Ph.D. at Renmin University of China, advised by Prof. Zhiwu Lu.
Background
Research interests: multimodal foundation models and video understanding. Currently a Ph.D. student at Renmin University of China, advised by Prof. Zhiwu Lu.