Published multiple papers, including 'Segment Anything in High Quality' (NeurIPS 2023), 'Gaussian Grouping: Segment and Edit Anything in 3D Scenes' (ECCV 2024), and 'Matching Anything By Segmenting Anything' (CVPR 2024); involved in projects such as Gaussian Grouping, HQ-SAM, and SAM-PT; presented research at international conferences such as ICCV, CVPR, and ECCV; served as an Area Chair for NeurIPS 2025 and ICLR 2026; organized a workshop at ICCV 2025.
Research Experience
Currently a Senior Researcher at Tencent AI Seattle; was a Postdoctoral Research Associate at Carnegie Mellon University, working with Prof. Katerina Fragkiadaki; served as a visiting PhD student at the Computer Vision Lab, ETH Zurich, supervised by Prof. Fisher Yu and Dr. Martin Danelljan.
Education
Obtained his PhD from the CSE Department at HKUST in mid-2023, supervised by Chi-Keung Tang and Yu-Wing Tai. During his PhD, he also spent two years as a visiting scholar at ETH Zurich. He received his B.E. degree from the School of Computer Science at Wuhan University.
Background
Senior Research Scientist at Tencent AI, Seattle Lab. His primary research interest lies in building multimodal foundation systems, especially for visual understanding, reasoning, and generation. Previously, he was a Postdoctoral Research Associate in computer science at Carnegie Mellon University and a visiting PhD student at the Computer Vision Lab of ETH Zurich.
Miscellany
His open-source projects have more than 10K GitHub stars; he gave a guest lecture on Vision Foundation Models at Texas A&M University and delivered talks on Scene Understanding with Vision Foundation Models at Stanford SVL and MARVL.