Ziming Mao

Google Scholar ID: ycaUmLkAAAAJ
UC Berkeley
Distributed Systems, Big Data, AI Systems
Citations & Impact (all-time)
  • Citations: 665
  • H-index: 10
  • i10-index: 10
  • Publications: 18
  • Co-authors: 18 (list available)
Resume (English only)
Academic Achievements
  • Paper 'SkyLB: A Locality-Aware Cross-Region Load Balancer for LLM Inference' published at EuroSys 2026.
  • Paper 'Rethinking Cost of Distributed Caches for Datacenter Services' published at HotNets 2025.
  • Paper 'Spirit: Fairness for Interdependent Cache and Bandwidth Resources' published at SOSP 2025.
  • Paper 'SkyServe: Serving AI Models across Regions and Clouds with Spot Instances' published at EuroSys 2025.
  • Paper 'LEANN: A Low-Storage Vector Index' published at ICML 2025 VecDB Workshop.
  • Paper 'An Extensible Software Transport Layer for GPU Networking' released as an arXiv preprint in 2025.
  • Paper 'Trinity: A Fast Compressed Multi-attribute Data Store' published at EuroSys 2024, Best Student Paper Award.
  • Paper 'Can’t Be Late: Optimizing Spot Instance Savings under Deadlines' published at NSDI 2024, Outstanding Paper Award.
  • Paper 'Revisiting Cache Freshness for Emerging Real-Time Applications' published at HotNets 2024.
  • Paper 'Locality-aware Fair Scheduling in LLM Serving' published at NSDI 2024.
Research Experience
  • Worked on UCCL, a fast and extensible GPU communication library that supports heterogeneous GPU and networking vendors.
  • Worked on Clink, a consistent distributed caching architecture for high-volume search and AI serving.
  • Worked on SkyServe, serving AI models across regions and clouds over Spot and On-Demand GPUs.
  • Worked on SkyPilot, a framework for running LLMs, AI, and batch jobs on any cloud.
  • Worked on Ray Data, a framework for efficient execution of ML training and inference pipelines over heterogeneous resources.
  • Earlier worked on Trinity, a distributed data store that achieves both fast multi-attribute queries and storage efficiency.
  • Worked on NLP at the Yale LILY lab, focusing on retrieval-based generation and table-based question answering.
Background
  • Research interests span systems and networking, particularly in the context of cloud, data, and AI. Currently a Ph.D. student at UC Berkeley, advised by Prof. Ion Stoica and Prof. Scott Shenker, and also working with Prof. Rishabh Iyer. Completed undergraduate studies at Yale University, double majoring in Computer Science and Philosophy.
Miscellany
  • Personal interests include astrophotography.