Pengyu Cheng

Google Scholar ID: eeQ_yCkAAAAJ
Alibaba Group
machine learning, natural language processing
Citations & Impact (all-time)
Citations: 1,479
H-index: 16
i10-index: 21
Publications: 20
Co-authors: 17
Resume (English only)
Academic Achievements
  • Published multiple papers, including 'Self-playing Adversarial Language Game Enhances LLM Reasoning' and 'Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Games'. Served as an Area Chair for ARR 2025. Had papers accepted at NAACL 2025, NeurIPS 2024, EMNLP 2024, ACL 2024, and other international conferences.
Research Experience
  • Currently a researcher at Alibaba Group, leading the Quark Foundation LLM RL Team. Previously worked with the RL & Agent Team at Moonshot AI (Kimi) and the Hunyuan LLM Team at Tencent AI Lab. Conducted research on Bayesian and probabilistic machine learning during graduate school.
Education
  • Received a Ph.D. from the Department of Electrical and Computer Engineering at Duke University in 2021, advised by Dr. Lawrence Carin. Graduated with a B.S. from the Department of Mathematical Sciences at Tsinghua University in 2017.
Background
  • Researcher at Alibaba Group, leading the Quark Foundation LLM RL Team. Focuses on enhancing LLMs’ capacity via RLHF, RLVR, and agentic RL. Previously a member of the RL & Agent Team at Moonshot AI (Kimi) and the Hunyuan LLM Team at Tencent AI Lab. Research interests include LLM Self-play, Alignment (RLHF), Text Generation, and NLP Fairness.