Publications: 'Stepwise guided policy optimization: Coloring your incorrect reasoning in GRPO' (coauthored with Peter Chen, Xiaopeng Li, Ziniu Li, Xi Chen); 'ComPO: Preference alignment via comparison oracles' (coauthored with Peter Chen, Xi Chen, Wotao Yin), accepted to NeurIPS 2025; 'Two-timescale gradient descent ascent algorithms for nonconvex minimax optimization' (coauthored with Chi Jin, Michael I. Jordan), accepted to the Journal of Machine Learning Research.
Research Experience
Postdoctoral researcher at the Laboratory for Information & Decision Systems (LIDS), Massachusetts Institute of Technology, from 2023 to 2024, working with Professor Asuman Ozdaglar.
Education
Ph.D. in Electrical Engineering and Computer Science from UC Berkeley, advised by Professor Michael I. Jordan; M.S. in Pure Mathematics and Statistics from the University of Cambridge; M.S. in Operations Research from UC Berkeley; B.S. in Mathematics from Nanjing University.
Background
Brief introduction: Assistant Professor in the Department of Industrial Engineering and Operations Research (IEOR) at Columbia University.
Research interests: optimization and machine learning, game theory, social and economic networks, and optimal transport.