MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs (ASPLOS 2025)
GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism (ASPLOS 2025)
Fairness in Serving Large Language Models (OSDI 2024)
S-LoRA: Serving Thousands of Concurrent LoRA Adapters (MLSys 2024)
Accelerating Data Serialization/Deserialization Protocols with In-Network Compute (ExaMPI@SC 2022)
AdaM: An Adaptive Fine-Grained Scheme for Distributed Metadata Management (ICPP 2019)
Background
I am mainly interested in accelerating and optimizing computation (especially ML workloads) on large-scale heterogeneous systems. I am currently looking for research interns to work on efficient RL training systems or agent training.