Browse publications on Google Scholar
Resume (English only)
Academic Achievements
Optimizing Speculative Decoding for Serving Large Language Models Using Goodput, arXiv, 2024.
Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity, arXiv, 2024.
Online Speculative Decoding, ICML 2024.
Leveraging Application Data Constraints to Optimize Database-Backed Web Applications, VLDB 2023.
GACT: Activation Compressed Training for Generic Network Architectures, ICML 2022.
Order-Preserving Key Compression for In-Memory Search Trees, SIGMOD 2020.
Research Experience
Before joining OpenAI, she was a CS PhD student at UC Berkeley, where she worked on a range of research projects, including as a member of the vLLM team.
Education
PhD: University of California, Berkeley, advised by Professors Alvin Cheung and Ion Stoica.
Master's: Carnegie Mellon University, where she worked with Andy Pavlo and Huanchen Zhang.
Undergraduate: Peking University.
Background
Currently a member of the technical staff at OpenAI. Previously, she was a CS PhD student at UC Berkeley, affiliated with Sky Lab (formerly known as the RISE/AMP Lab). Broadly interested in building efficient machine learning systems.
Miscellany
Occasionally writes blog posts on topics such as online speculative decoding.