Published several papers, including 'Learning a Pessimistic Reward Model in RLHF' and 'Improving Assembly Code Performance with Large Language Models via Reinforcement Learning'. Holds multiple patents, such as 'Thin-film optical parametric oscillators'.
Research Experience
Machine Learning Engineer Intern at Meta, Summer 2025; Research Intern at Google, Summer 2024; Applied Scientist Intern at Amazon, Summer 2023; Applied Scientist Intern at Amazon, Summer 2022.
Education
PhD in Computer Science at University of Illinois Urbana-Champaign, 2021 - present, Advisor: Prof. Gagandeep Singh; M.S. in Electrical and Computer Engineering at Georgia Institute of Technology, 2019 - 2021, Advisor: Prof. Jacob Abernethy; B.S. in Physics at Peking University, 2015 - 2019, Advisor: Prof. Yun-Feng Xiao.
Background
Research interests include reinforcement learning from human feedback (RLHF), offline preference-based reinforcement learning, trustworthy reinforcement learning (such as adversarial attacks, provably efficient exploration, provably robust exploration, and verification of deep reinforcement learning), and multi-armed bandit theory.