Co-authors: 15 (list available)
Academic Achievements
- Published multiple papers as first author or co-author in top-tier venues, including NSDI, OSDI, SIGCOMM, and TOCS, such as:
- “TokenLake: A Unified Segment-level Prefix Cache Pool for Fine-grained Elastic Long-Context LLM Serving” (Preprint)
- “StreamRL: Scalable, Heterogeneous, and Elastic RL for LLMs with Disaggregated Stream Generation” (Preprint)
- “RAGCache: Efficient Knowledge Caching for Retrieval-Augmented Generation” (TOCS 2025, to appear)
- “Fast Distributed Inference Serving for Large Language Models” (NSDI 2026, to appear, equal contribution)
- “DistTrain: Addressing Model and Data Heterogeneity with Disaggregated Training for Multimodal Large Language Models” (SIGCOMM 2025)
- “RLHFuse: Efficient RLHF Training for Large Language Models with Inter- and Intra-Stage Fusion” (NSDI 2025)
- “dLoRA: Dynamically Orchestrating Requests and Adapters for LoRA LLM Serving” (OSDI 2024)
- “Jolteon: Unleashing the Promise of Serverless for Serverless Workflows” (NSDI 2024)
- “Fast Vector Query Processing for Large Datasets Beyond GPU Memory with Reordered Pipelining” (NSDI 2024)
- “Ditto: Efficient Serverless Analytics with Elastic Parallelism” (SIGCOMM 2023)
- “Fast, Approximate Vector Queries on Very Large Unstructured Datasets” (NSDI 2023)
- “Transparent GPU Sharing in Container Clouds for Deep Learning Workloads” (NSDI)