Resume
Academic Achievements
June 2024: MG-Verilog: Multi-grained Dataset Towards Enhanced LLM-assisted Verilog Generation received the Best Paper Award at the inaugural IEEE LAD 2024 Workshop on LLM-Aided Design.
May 2024: Unveiling and Harnessing Hidden Attention Sinks accepted by ICML 2024.
February 2024: EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning and Voting accepted by DAC 2024.
July 2023: GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models accepted by ICCAD 2023.
July 2023: Gen-NeRF Demo won 2nd place in the University Demo Best Demonstration Award at DAC 2023.
April 2023: Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modularized Learning accepted by ICML 2023.
February 2023: Hint-Aug: A few-shot parameter-efficient tuning framework for vision transformers accepted by CVPR 2023.
February 2023: NetBooster: An efficiency-boosting framework for tiny neural networks accepted by DAC 2023.
Research Experience
Currently a Research Scientist at NVIDIA, focusing on on-the-fly inference upgrades for foundation models and on LLM-assisted co-design of AI accelerators.
Education
Earned a Ph.D. in Computer Science from Georgia Tech, advised by Prof. Yingyan (Celine) Lin; holds an M.S. from Columbia University and a B.Eng. from Zhejiang University. Has collaborated with the MIT-IBM Watson AI Lab.
Background
Research interests include designing efficient learning algorithms for large language models, with a focus on inference calibration, adaptive tuning, and human-in-the-loop hardware design. His work aims to bridge LLM foundations with practical deployment on data- and compute-constrained platforms.