Scholar

Yufei Zhan

Google Scholar ID: RvHqTGEAAAAJ

Institute of Automation, Chinese Academy of Sciences

Computer Vision, Large Multimodal Models, Grounding and Detection

Citations & Impact

All-time

Citations

124

H-index

4

i10-index

4

Publications

10

Co-authors

5

Publications

11 items

TraceVision: Trajectory-Aware Vision-Language Model for Human-Like Spatial Understanding

2026

Cited

0

Baseline Method of the Foundation Model Challenge for Ultrasound Image Analysis

2026

Cited

0

GeM-VG: Towards Generalized Multi-image Visual Grounding with Multimodal Large Language Models

arXiv.org · 2026

Cited

0

Unleashing Perception-Time Scaling to Multimodal Reasoning Models

2025

Cited

0

Seeing is Believing? Mitigating OCR Hallucinations in Multimodal Large Language Models

2025

Cited

0

FOCUS: Unified Vision-Language Modeling for Interactive Editing Driven by Referential Segmentation

2025

Cited

0

VFaith: Do Large Multimodal Models Really Reason on Seen Images Rather than Previous Memories?

2025

Cited

0

GThinker: Towards General Multimodal Reasoning via Cue-Guided Rethinking

2025

Cited

0

Co-authors

5 total

Jinqiao Wang 王金桥

Professor, Institute of Automation, Chinese Academy of Sciences

Associate Professor, Chinese Academy of Sciences, Institute of Automation

Westlake University

Institute of Automation, Chinese Academy of Sciences

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences