Notable works include 'Reasoning with Exploration: An Entropy Perspective' (AAAI 2026), FlowRL, and STILL.
Research Experience
Research student in the GenAI Group at Microsoft Research since 2021, advised by Shaohan Huang and Furu Wei. Previously a research assistant in the CoAI Group at Tsinghua University, advised by Yuxian Gu and Minlie Huang. Also worked as a research engineer at BIGAI, collaborating with Xuekai Zhu.
Education
Ph.D. student at Gaoling School of AI, Renmin University of China, advised by Xin Zhao.
Background
Research Interests: Reinforcement learning for LLM reasoning, especially exploration mechanisms.
Professional Field: Artificial Intelligence.