Yuzi Yan

Google Scholar ID: FBoRIz8AAAAJ

Tsinghua University, Moonshot AI

Robustness in RL, LLM, mLLM, Robotics

Citations & Impact

All-time citations: 1,261

Publications

4 items

2026

2025

arXiv.org · 2024

arXiv.org · 2024

Resume (English only)

Academic Achievements

Publications at top venues: ICLR 2025, ICML 2024, IEEE TSP, ICASSP 2023 (oral), ICRA 2022, INTERSPEECH 2021, ICASSP 2021
Preprints: 'Reward-Robust RLHF in LLMs', 'Uncertainty-aware Reward Model', etc.
Champion of Tencent AI Arena Multi-agent Reinforcement Learning Competition (2022, 2023)
First Prize (2nd place), ICRA RoboMaster University Sim2Real Challenge by DJI (2022)
3rd place, World University Math & Intelligence Competition (Chengdu FISU World University Games)

Research Experience

Feb 2025–present: Research Intern at Moonshot AI, working on general RL for multimodal LLMs and developing Kimi K-series models (e.g., Kimi-K2, Kimi-Dev-72B, Kimi-VL)
Aug 2023–Jan 2025: Research Intern in RLHF group at Baichuan AI, mentored by Dong Yan
Aug 2020–mid 2023: Research Intern at Machine Learning Group, Microsoft Research Asia (MSRA), mentored by Xu Tan, Tao Qin, and Tieyan Liu
Oct 2024–Mar 2025: 6-month visiting researcher at UIUC, hosted by Tamer Basar
Collaborating with Yu Wang in the Collaborative Intelligence Group
Working with Kaiqing Zhang and Tamer Basar on theoretical foundations of RL/MARL

Co-authors

6 total