Browse publications on Google Scholar (top-right) ↗
Resume (English only)
Academic Achievements
Paper: RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
Authors: Di Liu, Meng Chen, Baotong Lu, Huiqiang Jiang, Zhenhua Han, Qianxi Zhang, Qi Chen, Chengruidong Zhang, Bailu Ding, Kai Zhang, Chen Chen, Fan Yang, Yuqing Yang, Lili Qiu
Conference: NeurIPS Workshop ENLSP-IV (Best Paper Award), NeurIPS 2025

Paper: MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
Authors: Huiqiang Jiang, Yucheng Li, Chengruidong Zhang, Qianhui Wu, Xufang Luo, Surin Ahn, Zhenhua Han, Amir Abdi, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, Lili Qiu
Conference: NeurIPS 2024

Paper: Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
Authors: Chaofan Lin, Zhenhua Han*, Chengruidong Zhang, Yuqing Yang, Fan Yang, Chen Chen, Lili Qiu
Conference: USENIX OSDI 2024

Paper: PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation
Authors: Ningxin Zheng, Huiqiang Jiang, Quanlu Zhang*, Zhenhua Han*, Lingxiao Ma, Yuqing Yang*, Fan Yang, Chengruidong Zhang, Lili Qiu, Mao Yang, Lidong Zhou
Conference: ACM SOSP 2023

Paper: Optimizing Dynamic Neural Networks with Brainstorm
Authors: Weihao Cui, Zhenhua Han, Lingji Ouyang, Yichuan Wang, Ningxin Zheng, Lingxiao Ma, Yuqing Yang, Fan Yang, Jilong Xue, Lili Qiu, Lidong Zhou, Quan Chen, Haisheng Tan, Minyi Guo
Conference: USENIX OSDI 2023

Paper: Dynamic Resource Allocation for Deep Learning Clusters with Separated Compute and Storage
Authors: Mingxia Li, Zhenhua Han, Chi Zhang, Ruiting Zhou, Yuanchi Liu, Haisheng Tan
Conference: IEEE INFOCOM 2023

Paper: ElasticFlow: An Elastic Serverless Training Platform for Distributed Deep Learning
Authors: Diandian Gu, Yihao Zhao, Yinmin Zhong, Yifan Xiong, Zhenhua Han, Peng Cheng, Fan Yang, Gang Huang, Xin Jin, Xuanzhe Liu
Conference: ASPLOS 2023

Paper: SiloD: A Co-design of Caching and Scheduling for Deep Learning Clusters
Authors: Hanyu Zhao*, Zhenhua Han*, Zhi Yang, Quanlu Zhang, Mingxia Li, Fan Yang, Qianxi Zhang, Binyang Li, Yuqing Yang, Lili Qiu, Lintao Zhang, Lidong Zhou
Conference: EuroSys 2023

Paper: PilotFish: Harvesting Free Cycles of Cloud Gaming with Deep Learning Training
Authors: Wei Zhang, Binghao Chen, Zhenhua Han, Quan Chen, Peng Cheng, Fan Yang, Ran Shu, Yuqing Yang, Minyi Guo
Conference: USENIX ATC 2022

Paper: HiveD: Sharing a GPU Cluster for Deep Learning with Guarantees
Authors: Hanyu Zhao*, Zhenhua Han*, Zhi Yang, Quanlu Zhang, Fan Yang, Lidong Zhou, Mao Yang, Francis C.M. Lau, Yuqi Wang, Yifan Xiong, Bin Wang
Conference: USENIX OSDI 2020

Paper: Retiarii: A Deep Learning Exploratory-Training Framework
Authors: Quanlu Zhang, Zhenhua Han, Fan Yang, Yuge Zhang, Zhe Liu, Mao Yang, Lidong Zhou
Conference: USENIX OSDI 2020

Paper: Automating Cloud Deployment for Deep Learning Inference of Real-time Online Services
Authors: Yang Li*, Zhenhua Han*, Quanlu Zhang, Zhenhua Li, Haisheng Tan
Conference: IEEE INFOCOM 2020; IEEE/ACM Transactions on Networking, 2023

Paper: Gandiva: Introspective Cluster Scheduling for Deep Learning
Authors: Wencong Xiao, Romil Bhardwaj, Ramachandran Ramjee, Muthian Sivathanu, Nipun Kwatra, Zhenhua Han, Pratyush Patel, Xuan Peng, Hanyu Zhao, Quanlu Zhang, Fan Yang, Lidong Zhou
Conference: USENIX OSDI 2018
Background
Research Interests: Resource management, systems for machine learning, and cloud computing. Currently a Senior Researcher at Microsoft Research Asia (Shanghai).