Scholar

Chaoyang Zhu

Google Scholar ID: jkxdiToAAAAJ

CSE, Hong Kong University of Science and Technology

Multimodal Learning & Reasoning

Citations & Impact

All-time

Citations

849

Publications

3 items (2026, 2025, 2025)

Resume (English only)

Academic Achievements

Publications: ICLR 2025 (Open-Vocabulary Detection), MM 2024 (Instruction-tuned SAM), TPAMI 2024 (Survey on Open-Vocabulary Perception), ECCV 2022 (SeqTR, Oral, 200+ citations), ICCV 2021 (TRAR, 150+ citations), TNNLS 2019 (actual 1st author, 400+ citations). Awards: Outstanding PGTA honorable mention from the CSE Department, HKUST; Star of Future prize from Tencent Youtu Lab; Outstanding Graduate mention from HDU.

Research Experience

Intern at Tencent Youtu Lab, Shanghai (Jul. 2021 - Mar. 2022); Ph.D. candidate at The Hong Kong University of Science and Technology (Aug. 2023 - present).

Education

Ph.D. in Informatics, XMU (2023-present), Advisor: Prof. Rongrong Ji; Master's in Computer Science and Engineering, XMU (2020-2023).

Background

Research Interests: Perception tasks, including closed-set and open-vocabulary detection and segmentation in 2D, 3D, and video, involving MLLMs, point clouds, NeRF, and 3DGS. Fascinated by coding, especially CUDA kernels and C++ extensions.

Miscellany

Blogging: Hosted on Notion. Posts include beginner-friendly annotations of the CUDA code for 3D Gaussian Splatting, a basic tutorial on 3D Radiance & Semantic Fields, a tentative mathematical derivation of diffusion models, and an OpenMMLab API framework.

Co-authors

4 total