Mostofa Patwary

Google Scholar ID: 0rt4tbMAAAAJ
Director, Applied Deep Learning Research, NVIDIA
Natural Language Processing, Large Scale Deep Learning, High Performance Computing, Parallel
Citations & Impact
All-time
Citations: 10,446
H-index: 35
i10-index: 57
Publications: 20
Co-authors: 13 (list available)
Resume (English only)
Academic Achievements
  • Published papers at venues including NeurIPS 2022, ACL 2021, EMNLP 2020, and EACL 2023
  • Contributed to Megatron-Turing NLG 530B, described at its release as the world’s largest and most powerful generative language model
  • Proposed the Minitron approach for LLM pruning and distillation (arXiv 2024)
  • Contributed to StarCoder 2 and The Stack v2 (arXiv 2024)
  • Paper 'Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM' received the Best Student Paper Award at SC 2021
  • The Megatron-LM paper has approximately 300 citations