Publications
1. DORM: Preference Data Weights Optimization for Reward Modeling in LLM Alignment (EMNLP, 2025).
2. Self-Rewarding PPO: Aligning Large Language Models with Demonstrations Only (COLM, 2025).
3. Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training (NAACL, 2025).
4. LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy (NeurIPS ML&C Workshop, 2024).
5. Aligning Large Language Models with Representation Editing: A Control Perspective (NeurIPS, 2024).
6. TPD: Enhancing Student Language Model Reasoning via Principle Discovery and Guidance (COLM, 2024).
Research Experience
Research internships at Google Research, Microsoft Azure AI, and Amazon Stores Foundational AI.
Education
1. Georgia Institute of Technology, Ph.D. Candidate in Machine Learning (Aug. 2019 - Present), Advisor: Prof. Chao Zhang; M.S. in Electrical and Computer Engineering (May 2021).
2. Zhejiang University, B.Eng. in Measurement and Control Technology and Instruments (Aug. 2015 - June 2019).
3. Harvard Medical School, Visiting Student Researcher in the Neural System Group (Sep. 2018 - May 2019).
Background
Research interests lie primarily in the model efficiency and data efficiency of language models. Completed several research internships at Google Research, Microsoft Azure AI, and Amazon Stores Foundational AI.