Scholar
Yufei Zhan
Google Scholar ID: RvHqTGEAAAAJ
Institute of Automation, Chinese Academy of Sciences
Computer Vision
Large Multimodal Models
Grounding and Detection
Citations & Impact
All-time
Citations
124
H-index
4
i10-index
4
Publications
10
Co-authors
5
list available
Publications
11 items
TraceVision: Trajectory-Aware Vision-Language Model for Human-Like Spatial Understanding
2026
Cited
0
Baseline Method of the Foundation Model Challenge for Ultrasound Image Analysis
2026
Cited
0
GeM-VG: Towards Generalized Multi-image Visual Grounding with Multimodal Large Language Models
arXiv.org · 2026
Cited
0
Unleashing Perception-Time Scaling to Multimodal Reasoning Models
2025
Cited
0
Seeing is Believing? Mitigating OCR Hallucinations in Multimodal Large Language Models
2025
Cited
0
FOCUS: Unified Vision-Language Modeling for Interactive Editing Driven by Referential Segmentation
2025
Cited
0
VFaith: Do Large Multimodal Models Really Reason on Seen Images Rather than Previous Memories?
2025
Cited
0
GThinker: Towards General Multimodal Reasoning via Cue-Guided Rethinking
2025
Cited
0
Co-authors
5 total
Jinqiao Wang 王金桥
Professor, Institute of Automation, Chinese Academy of Sciences
Yousong Zhu
Associate Professor, Chinese Academy of Sciences, Institute of Automation
Zhiyang Chen
Westlake University
Chaoyang Zhao
Institute of Automation, Chinese Academy of Sciences
Co-author 5