Agora | Research Hub

Citations & Impact

All-time

Citations

295

H-index

8

i10-index

7

Publications

20

Co-authors

33

Contact

Email: zhenan.fan1@huawei.com | GitHub | LinkedIn

Publications

10 items

MARLaaS: Multi-Tenant Asynchronous Reinforcement Learning as a Service

2026

Cited

0

ReviveMoE: Fast Recovery for Hardware Failures in Large-Scale MoE LLM Inference Deployments

2026

Cited

0

DECKBench: Benchmarking Multi-Agent Frameworks for Academic Slide Generation and Editing

2026

Cited

0

MEPIC: Memory Efficient Position Independent Caching for LLM Serving

2025

Cited

0

ElasticMoE: An Efficient Auto Scaling Method for Mixture-of-Experts Models

2025

Cited

0

ExpertWeave: Efficiently Serving Expert-Specialized Fine-Tuned Adapters at Scale

2025

Cited

0

HyperFlexis: Joint Design of Algorithms and Systems for Multi-SLO Serving and Fast Scaling

2025

Cited

0

Enhancing Learned Knowledge in LoRA Adapters Through Efficient Contrastive Decoding on Ascend NPUs

2025

Cited

0

Resume (English only)

Background

Currently a Staff Research Engineer at the Huawei Vancouver Research Center, focusing on optimizing large language model (LLM) inference and deployment on Huawei’s CloudMatrix SuperPod. Recent work centers on scaling LLM serving to SuperPod-scale infrastructure, addressing the challenges of running mixture-of-experts (MoE) models such as DeepSeek, Kimi, and Qwen across hundreds of interconnected NPUs. Previously contributed to the design and development of OptVerse, Huawei’s in-house large-scale optimization solver, with a focus on algorithmic innovations and system-level integration for real-world optimization tasks in the cloud.

Miscellany