Three papers accepted at NeurIPS 2024; Depth-µP and Skill-Mix accepted at ICLR 2024; released Skill-Mix, a new evaluation of LLMs' ability to combine basic skills; Tensor Programs VI introduces Depth-µP, a parametrization for scaling networks toward infinite depth.
Research Experience
Previously worked as a researcher at Microsoft, contributing to the development of Phi-4; accumulated extensive research experience before joining OpenAI.
Education
Ph.D. in Computer Science from Princeton University, advised by Sanjeev Arora; Yao Class student in Computer Science at the Institute for Interdisciplinary Information Sciences, Tsinghua University.
Background
Researcher at OpenAI focusing on large language model research and development, particularly synthetic data generation. Interests lie at the intersection of deep learning theory and practice, including the optimization and evaluation of large models.
Miscellany
Participated in algorithmic programming competitions during high school and college, winning gold medals at IOI and ACM-ICPC World Finals.