Scholar

Shiyu Zhao (赵世雨)

Google Scholar ID: M-qzlU8AAAAJ

PhD student in Computer Science, Rutgers University

Computer Vision, Image Processing, Machine Learning

Citations & Impact

All-time citations: 1,218


Publications

20 items


Resume (English only)

Academic Achievements

Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction (CVPR, 2025)
Generating Enhanced Negatives for Training Language-Based Object Detectors (CVPR, 2024)
Taming Self-Training for Open-Vocabulary Object Detection (CVPR, 2024)
Exploiting Unlabeled Data with Vision and Language Models for Object Detection (ECCV, 2022)
Global Matching with Overlapping Attention for Optical Flow Estimation (CVPR, 2022)
Deep Animation Video Interpolation in the Wild (CVPR, 2021)
RefineDNet: A Weakly Supervised Refinement Framework for Single Image Dehazing (Transactions on Image Processing, 2021)
Dehazing Evaluation: Real-world Benchmark Datasets, New Criteria and Baselines (Transactions on Image Processing, 2020)
A CNN-based Depth Estimation Approach with Multi-scale Sub-pixel Convolutions and A Smoothness Constraint (ACCV, 2018)

Research Experience

Research intern and student researcher at Meta, Google, NEC Labs, and SenseTime, working on generative models, multimodal models, LLMs, vision-and-language models, and image understanding.

Education

Received a B.S. in 2017 and an M.S. in 2020 from the School of Software Engineering at Tongji University, supervised by Prof. Lin Zhang. Currently pursuing a PhD in the Department of Computer Science at Rutgers University, supervised by Prof. Dimitris Metaxas.

Background

Research interest: solving computer vision problems with large foundation models. Has worked as a research intern and student researcher at Meta, Google, NEC Labs, and SenseTime on generative models, multimodal models, LLMs, vision-and-language models, and image understanding.

Miscellany