Yuzi Yan
Google Scholar ID: FBoRIz8AAAAJ
Tsinghua University, Moonshot AI
Robustness in RL, LLM, mLLM, Robotics
Citations & Impact (all-time)
  • Citations: 1,261
  • H-index: 10
  • i10-index: 10
  • Publications: 15
  • Co-authors: 6
Resume (English only)
Academic Achievements
  • Publications at top venues: ICLR 2025, ICML 2024, IEEE TSP, ICASSP 2023 (oral), ICRA 2022, INTERSPEECH 2021, ICASSP 2021
  • Preprints: 'Reward-Robust RLHF in LLMs', 'Uncertainty-aware Reward Model', etc.
  • Champion of Tencent AI Arena Multi-agent Reinforcement Learning Competition (2022, 2023)
  • First Prize (2nd place), ICRA RoboMaster University Sim2Real Challenge by DJI (2022)
  • 3rd place, World University Math & Intelligence Competition (Chengdu FISU World University Games)
Research Experience
  • Feb 2025–present: Research Intern at Moonshot AI, working on general RL for multimodal LLMs and developing Kimi K-series models (e.g., Kimi-K2, Kimi-Dev-72B, Kimi-VL)
  • Aug 2023–Jan 2025: Research Intern in RLHF group at Baichuan AI, mentored by Dong Yan
  • Aug 2020–mid 2023: Research Intern in the Machine Learning Group, Microsoft Research Asia (MSRA), mentored by Xu Tan, Tao Qin, and Tieyan Liu
  • Oct 2024–Mar 2025: Visiting Researcher at UIUC (6 months), hosted by Tamer Basar
  • Collaborating with Yu Wang in the Collaborative Intelligence Group
  • Working with Kaiqing Zhang and Tamer Basar on theoretical foundations of RL/MARL