Scholar

Yifan Du

Google Scholar ID: YJf-45cAAAAJ

Renmin University of China

Vision Language Model, MLLM

Citations & Impact

All-time

Citations

8,451

H-index

i10-index

Publications

Co-authors

list available

Publications

12 items

Resume (English only)

Academic Achievements

Published 'Virgo: A Preliminary Exploration on Reproducing o1-like MLLM'
Published 'Exploring the Design Space of Visual Context Representation in Video MLLMs' (arXiv)
Published 'Towards Event-oriented Long Video Understanding' (arXiv)
Published 'What Makes for Good Visual Instructions? Synthesizing Complex Visual Reasoning Instructions for Visual Instruction Tuning' (COLING 2025)
Published 'Evaluating Object Hallucination in Large Vision-Language Models' (EMNLP 2023)
Co-authored survey 'A Survey of Large Language Models'
Published 'Zero-shot Visual Question Answering with Language Model Feedback' (ACL 2023 Findings)
Published 'Learning to Imagine: Visually-Augmented Natural Language Generation' (ACL 2023)
Published survey 'A Survey of Vision-Language Pre-Trained Models' (IJCAI 2022)
Open-source project: Virgo (an MLLM with slow-thinking reasoning ability)
Open-source project: POPE (a benchmark for evaluating object hallucination in MLLMs)

Co-authors

7 total