Luo Mai
Google Scholar ID: I6GYccIAAAAJ
Associate Professor at University of Edinburgh
Computer Systems
Machine Learning
Data Management
Citations & Impact
All-time
Citations: 1,050
H-index: 19
i10-index: 22
Publications: 20
Co-authors: 67 (list available)
Publications
7 items
TokenScale: Timely and Accurate Autoscaling for Disaggregated LLM Serving with Token Velocity (2025). Cited: 0
RAGBoost: Efficient Retrieval-Augmented Generation with Accuracy-Preserving Context Reuse (2025). Cited: 0
HybridServe: Efficient Serving of Large AI Models with Confidence-Based Cascade Routing (2025). Cited: 0
MoE-Gen: High-Throughput MoE Inference on a Single GPU with Module-Based Batching (2025). Cited: 0
WaferLLM: A Wafer-Scale LLM Inference System (2025). Cited: 0
MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems (2024). Cited: 0
MoE-Infinity: Efficient MoE Inference on Personal Machines with Sparsity-Aware Expert Cache (2024). Cited: 6
Co-authors
67 total
Peter Pietzuch (Professor of Distributed Systems, Imperial College London)
Paolo Costa (Microsoft Research)
Yao Fu (Ph.D. student, The University of Edinburgh)
Leyang Xue (University of Edinburgh)
Yeqi Huang (University of Edinburgh)
Hao Dong (Tenured Associate Professor at Peking University)