Published the paper 'Defeating the Training-Inference Mismatch via FP16' (preprint, 2025); contributed to the project GEM: A Gym for Agentic LLMs (preprint, 2025).
Research Experience
Xiaohongshu Hi Lab, RedStar Intern, Aug. 2025 - Present; Sea AI Lab, Associate Member, Jul. 2025 - Aug. 2025; ByteDance Seed, Research Intern, May 2025 - Jul. 2025; ByteDance AI Lab, Research Intern, May 2023 - May 2025; ByteDance AML, Research Intern, Sep. 2022 - May 2023.
Education
Institute of Automation, Chinese Academy of Sciences / School of Artificial Intelligence, University of Chinese Academy of Sciences, Ph.D. Student, Sep. 2021 - Present (Advisor: Liang Wang); Tsinghua University, B.Eng. in Electronic Engineering, Sep. 2016 - Jul. 2021.
Background
Research focuses on reinforcement learning for enhancing large language models (LLMs), improving their reasoning abilities and making their responses more accurate, reliable, trustworthy, and interpretable. Also studies long-term memory for LLMs, aiming to let them personalize their behavior through continual interaction with users. Additionally, works on AI for Drug Discovery (AIDD), developing generative models and algorithms for designing small molecules and proteins.