Zhihe Yang

Google Scholar ID: Vp0ENWcAAAAJ
The Chinese University of Hong Kong
Offline RL, RLHF, LLM, LMM
Citations & Impact
All-time
  • Citations: 85
  • H-index: 4
  • i10-index: 2
  • Publications: 7
  • Co-authors: 7
Resume (English only)
Academic Achievements
  • Co-first-author paper ADG on Robust RL accepted by NeurIPS 2025.
  • First-author paper SDQC on Safe RL accepted by ICML 2025.
  • First-author paper OPA-DPO on RL4VLMs accepted by CVPR 2025 as an oral presentation.
  • First-author paper DMBP on Robust RL accepted by ICLR 2024.
  • Honors: “Project Up” Talent Program Award from Tencent; “Stars of Tomorrow” Award of Excellence from MSRA (top 10% of research interns); CUHK Postgraduate Scholarship; Singapore Professional Engineers Board Gold Medal (Best Graduate in NUS-ME, sole recipient); Outstanding Graduate of Zhejiang University (top 10% of undergraduate students); Zhejiang Provincial Government Scholarship.
Research Experience
  • Research Intern at Tencent-Hunyuan (2025.7 - present): research on RL for dense captioning with multimodal models (video/audio/image).
  • Research Intern at MSRA (2024.6 - 2025.2): research on RLH(AI)F algorithms for (multimodal) large language models.
  • Engineer Intern at FESTO (2020.6 - 2020.8): designed industrial cylinder structures.
Education
  • PhD: The Chinese University of Hong Kong (2022.8 - 2026.6 expected), Mechanical and Automation Engineering, supervised by Prof. Yunjian Xu.
  • M.S.: National University of Singapore (2020.9 - 2022.7), Mechanical Engineering, supervised by Prof. Wentao Yan.
  • B.E.: Zhejiang University (2017.8 - 2021.6), Mechanical Engineering.
Background
  • Research interests focus on developing trustworthy RL algorithms for robotic control as well as exploring large language models (LLMs), large vision-language models (LVLMs), and image/video generation models.
Miscellany
  • Expected to graduate in June 2026; actively seeking full-time positions in industry.