Co-first-author paper ADG on Robust RL accepted by NeurIPS 2025; First-author paper SDQC on Safe RL accepted by ICML 2025; First-author paper OPA-DPO on RL4VLMs accepted by CVPR 2025 as an Oral Presentation; First-author paper DMBP on Robust RL accepted by ICLR 2024; Honored with the “Project Up” Talent Program Award from Tencent, the “Stars of Tomorrow” Award of Excellence from MSRA (Top 10% research intern), CUHK Postgraduate Scholarship, Singapore Professional Engineers Board Gold Medal (Best Graduate in NUS-ME, sole recipient), Outstanding Graduate from Zhejiang University (Top 10% undergraduate students), and Zhejiang Provincial Government Scholarship.
Research Experience
Research Intern at Tencent-Hunyuan (2025.7 - Present): Research on RL for dense captioning with multimodal models (video/audio/image); Research Intern at MSRA (2024.6 - 2025.2): Research on RLH(AI)F algorithms for (Multimodal) Large Language Models; Engineer Intern at FESTO (2020.6 - 2020.8): Designed industrial cylinder structures.
Education
PhD: The Chinese University of Hong Kong (2022.8 - 2026.6 expected), Mechanical and Automation Engineering, supervised by Prof. Yunjian Xu; M.S.: National University of Singapore (2020.9 - 2022.7), Mechanical Engineering, supervised by Prof. Wentao Yan; B.E.: Zhejiang University (2017.8 - 2021.6), Mechanical Engineering.
Background
Research interests focus on developing trustworthy RL algorithms for robotic control as well as exploring large language models (LLMs), large vision-language models (LVLMs), and image/video generation models.
Miscellany
Expected to graduate in June 2026 and actively seeking full-time positions in industry.