Scholar
Daoan Zhang
Google Scholar ID: yeLdvGoAAAAJ
PhD Student, University of Rochester
Computer Vision
Multimodal Learning
LLM
Citations & Impact
All-time
Citations: 723
H-index: 12
i10-index: 14
Publications: 20
Co-authors: 7 (list available)
Publications
14 items
A Versatile Multimodal Agent for Multimedia Content Generation
arXiv.org · 2026 · Cited: 0
Sphinx: Benchmarking and Modeling for LLM-Driven Pull Request Review
arXiv.org · 2026 · Cited: 0
JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation
2025 · Cited: 0
VisualActBench: Can VLMs See and Act like a Human?
2025 · Cited: 0
LAST: LeArning to Think in Space and Time for Generalist Vision-Language Models
2025 · Cited: 0
UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist
2025 · Cited: 0
MaPPO: Maximum a Posteriori Preference Optimization with Prior Knowledge
2025 · Cited: 0
On Path to Multimodal Generalist: General-Level and General-Bench
2025 · Cited: 0
Co-authors
7 total
Jianguo Zhang
Professor, Southern University of Science and Technology
Hanjia Lyu
University of Rochester
Jianhua Yao
Tencent, IEEE Fellow, AIMBE Fellow
Wenjian Huang
Peking University
Hongwei Bran Li
Martinos Center, MGH, Harvard Medical School