Academic Achievements
2025: ASAP: an Agentic Solution to Auto-optimize Performance of Large-Scale LLM Training (NeurIPS ML for Systems Workshop)
2025: Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities (arXiv preprint arXiv:2507.06261)
2025: Profile-Guided Quantization: a compiler solution to automate quantization for efficient LLM training (ISCA MLArchSys workshop)
2025: SSDTrain: An Activation Offloading Framework to SSDs for Faster Large Language Model Training (DAC)
2024: ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization (NeurIPS)
2024: New Solutions on LLM Acceleration, Optimization, and Application (DAC)
2024: AutoAI2C: An Automated Hardware Generator for DNN Acceleration on both FPGA and ASIC (IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems)
Research Experience
Staff Software Engineer at Google, working on large-scale AI systems to enable efficient Gemini training and serving on TPUs.
Education
Received Ph.D. from the University of Illinois Urbana-Champaign (UIUC) in 2022, supervised by Prof. Deming Chen, with close collaboration with Prof. Wen-mei Hwu and Prof. Jinjun Xiong; B.S. and M.S. from the University of Electronic Science and Technology of China (UESTC) in Chengdu, China.
Background
AI systems engineer and researcher, with research interests in AI systems, energy-efficient computing, and hardware/software co-design.