Scholar
Kan Zhu
Google Scholar ID: wkTiqicAAAAJ
University of Washington
Machine learning system
Architecture
Citations & Impact
All-time
Citations
473
H-index
5
i10-index
4
Publications
11
Co-authors
10
list available
Publications
6 items
Accelerating Large-Scale Reasoning Model Inference with Sparse Self-Speculative Decoding
2025
Cited
0
PolyServe: Efficient Multi-SLO Serving at Scale
2025
Cited
0
TeleRAG: Efficient Retrieval-Augmented Generation Inference with Lookahead Retrieval
2025
Cited
0
Tactic: Adaptive Sparse Attention with Clustering and Distribution Fitting for Long-Context LLMs
2025
Cited
0
NanoFlow: Towards Optimal Large Language Model Serving Throughput
arXiv.org · 2024
Cited
31
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
arXiv.org · 2024
Cited
10
Co-authors
10 total
Baris Kasikci
University of Washington
Yilong Zhao
Ph.D. student, UC Berkeley
Chien-Yu Lin
PhD Student, University of Washington
Arvind Krishnamurthy
Short-Dooley Professor, Univ. of Washington
Zihao Ye
NVIDIA, University of Washington
Keisuke Kamahori
University of Washington
Lequn Chen
University of Washington
Size Zheng
ByteDance Seed