Scholar
Yonggan Fu
Google Scholar ID: pt3GfXcAAAAJ
NVIDIA Research
Efficient AI
Efficient Language Models
Model Compression
Homepage
Google Scholar
Citations & Impact
All-time
Citations: 1,825
H-index: 23
i10-index: 35
Publications: 20
Co-authors: 12
Publications
25 items
Fast-dVLM: Efficient Block-Diffusion VLM via Direct Conversion from Autoregressive VLM
2026 · Cited: 0
Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed
2025 · Cited: 0
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
2025 · Cited: 0
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models
2025 · Cited: 0
TiDAR: Think in Diffusion, Talk in Autoregression
2025 · Cited: 0
Fast-dLLM v2: Efficient Block-Diffusion LLM
2025 · Cited: 0
Fewer Denoising Steps or Cheaper Per-Step Inference: Towards Compute-Optimal Diffusion Model Deployment
2025 · Cited: 0
LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models
2025 · Cited: 0
Co-authors
12 total
Yingyan (Celine) Lin
Associate Professor, Georgia Institute of Technology
Chaojian Li
Hong Kong University of Science and Technology
Zhongzhi Yu
NVIDIA Research
Zhangyang (Atlas) Wang
XTX Markets & University of Texas at Austin
Meng Li
Peking University; formerly at Facebook
Vikas Chandra
Meta
Pavlo Molchanov
NVIDIA Research
Yang Zhang
MIT-IBM Watson AI Lab