Academic Achievements
Paper 'RetrievalAttention' accepted at NeurIPS 2025 and awarded Best Paper at the 4th NeurIPS Workshop on Efficient Natural Language and Speech Processing (ENLSP)
Multiple papers accepted by top-tier venues including ISCA, VLDB, IEEE ICDCS, IPDPS 2025, ACM TACO, and IEEE Transactions on Services Computing
Active research contributions in LLM inference, federated learning, GPU scheduling, serverless computing, and unified AI caching
Honors include Microsoft Research Asia StarTrack Scholar (2023), Pujiang Talent Program of Shanghai (2022), and Huawei Innovation Pioneer awards (2019H2, 2020H1, 2020H2)
Recipient of HKUST Postgraduate Studentship (2014–2018) and Tsinghua University Fellowship for Oversea Exchange (2013)
Background
Tenure-track Associate Professor and Doctoral Supervisor at the John Hopcroft Center for Computer Science, Shanghai Jiao Tong University (SJTU)
Affiliated with the School of Computer Science and a member of the Emerging Parallel Computing Center (EPCC) at SJTU
Research focuses on optimizing distributed systems for modern applications, especially machine learning workloads such as federated learning, deep learning scheduling, and LLM inference
Aims to identify fundamental system challenges and develop efficient, user-friendly system solutions
Actively seeking motivated undergraduate, Master's, and PhD students to join the research group