Selected for the 2025 CIPS Doctoral Dissertation Incentive Program
Papers '80/20 Rule' and 'SuperGPQA' accepted to NeurIPS 2025
'ProcessBench' paper awarded the ACL 2025 SAC Award
Recipient of the 2025 WAIC Yunfan Award (Spotlight, 'Rising Star')
Led or contributed to the release of key models and reports: the Qwen3 series (e.g., Qwen3-235B, Qwen3-Coder-480B, Qwen3-Thinking), the QwQ-32B reasoning model, and Yi-Lightning
Released multiple benchmarks and algorithms: SuperGPQA (comprehensive LLM evaluation), ProcessBench (process supervision for math reasoning), GSPO (large-scale RL), WorldPM (world preference modeling), ExPO (efficient alignment), DRO (safety prompt optimization), PriDe (debiasing in multiple-choice QA)
Published papers at top-tier conferences: ACL 2025 (ExPO, ProcessBench, Qwen2.5-Math-PRM), ICML 2024 (DRO), ICLR 2024 Spotlight (PriDe, top 5%)
Background
Researcher at the Qwen Team, Alibaba Group
Dedicated to building scalable, generalist AI systems
Focuses on methodologies that consistently and efficiently improve the intelligence and task-solving abilities of AI systems as compute and data scale
Current work centers on advancing the reasoning capabilities of Qwen models (e.g., Qwen3, QwQ) and developing large-scale reinforcement learning (RL) approaches
Broad research interests include model architecture, interpretability, safety, and alignment
Previously conducted extensive research on LLMs for social good, particularly emotional support systems