Resume (English only)
Academic Achievements
Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction (CVPR, 2025)
Generating Enhanced Negatives for Training Language-Based Object Detectors (CVPR, 2024)
Taming Self-Training for Open-Vocabulary Object Detection (CVPR, 2024)
Exploiting Unlabeled Data with Vision and Language Models for Object Detection (ECCV, 2022)
Global Matching with Overlapping Attention for Optical Flow Estimation (CVPR, 2022)
Deep Animation Video Interpolation in the Wild (CVPR, 2021)
RefineDNet: A Weakly Supervised Refinement Framework for Single Image Dehazing (Transactions on Image Processing, 2021)
Dehazing Evaluation: Real-world Benchmark Datasets, New Criteria and Baselines (Transactions on Image Processing, 2020)
A CNN-based Depth Estimation Approach with Multi-scale Sub-pixel Convolutions and A Smoothness Constraint (ACCV, 2018)
Research Experience
Research intern and student researcher at Meta, Google, NEC Labs, and SenseTime, working on generative models, multimodal models, LLMs, vision-and-language models, and image understanding.
Education
Received a B.S. in 2017 and an M.S. in 2020 from the School of Software Engineering at Tongji University, supervised by Prof. Lin Zhang. Currently pursuing a Ph.D. in the Department of Computer Science at Rutgers University, supervised by Prof. Dimitris Metaxas.
Background
Research interest: solving computer vision problems with large foundation models.
Miscellany
Actively looking for a full-time position in industry.