Resume
Academic Achievements
June 2024: MG-Verilog: Multi-grained Dataset Towards Enhanced LLM-assisted Verilog Generation received the Best Paper Award at the inaugural IEEE LAD 2024 Workshop on LLM-Aided Design.
May 2024: Unveiling and Harnessing Hidden Attention Sinks accepted by ICML 2024.
February 2024: EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning and Voting accepted by DAC 2024.
July 2023: GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models accepted by ICCAD 2023.
July 2023: Gen-NeRF Demo won 2nd place in the University Demo Best Demonstration Award at DAC 2023.
April 2023: Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modularized Learning accepted by ICML 2023.
February 2023: Hint-Aug: A few-shot parameter-efficient tuning framework for vision transformers accepted by CVPR 2023.
February 2023: NetBooster: An efficiency-boosting framework for tiny neural networks accepted by DAC 2023.
Research Experience
Currently a Research Scientist at NVIDIA, focusing on on-the-fly inference upgrades for foundation models and on LLM-assisted co-design of AI accelerators.
Education
Earned a Ph.D. in Computer Science from Georgia Tech, advised by Prof. Yingyan (Celine) Lin; holds an M.S. from Columbia University and a B.Eng. from Zhejiang University. Has collaborated with the MIT-IBM Watson AI Lab.
Background
Research interests include designing efficient learning algorithms for large language models, with a focus on inference calibration, adaptive tuning, and human-in-the-loop hardware design. His work aims to bridge LLM foundations with practical deployment on data- and compute-constrained platforms.