Publications
- FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion (Preprint, 2024)
- Characterizing Structural Regularities of Labeled Data in Overparameterized Models (ICML, 2021)
- Learning to Optimize Tensor Programs (NeurIPS, 2018)
- TVM: An Automated End-to-End Optimizing Compiler for Deep Learning (OSDI, 2018)
- Efficient Deep Learning Inference on Edge Devices (MLSys, 2018)
Research Experience
He has been heavily involved in projects such as MegaScale, Apache TVM, and Apache MXNet.
Education
He received his Ph.D. from the Paul G. Allen School of Computer Science & Engineering at the University of Washington, where he was advised by Luis Ceze and Tianqi Chen. He earned his Bachelor's degree from Fudan University, where he was a member of the Fudan NLP Lab, working with Xipeng Qiu and Zheng Zhang.
Background
Ziheng works on large language model (LLM) systems at ByteDance, focusing on scaling and optimizing LLM training and inference. His research interests include Machine Learning Systems and Large Language Models.