Harshay Shah
Google Scholar ID: oC8YKjUAAAAJ
MIT
Machine Learning
Citations & Impact (all-time)
  • Citations: 775
  • H-index: 6
  • i10-index: 6
  • Publications: 8
  • Co-authors: 0
Resume
Academic Achievements
  • Published papers: 'Do Language Models Robustly Acquire New Knowledge?' (NeurIPS CCFM, 2025) and 'Parameters vs FLOPs: Scaling Laws for Optimal Sparsity of MoE Language Models' (ICML, 2025; also ICLR SLLM, 2025). The former examines how language models acquire new knowledge through multi-hop reasoning tasks; the latter investigates how the trade-off between total parameters and compute per example affects the performance of sparse Mixture-of-Experts models.
Research Experience
  • Spent two years at Microsoft Research working with Praneeth Netrapalli and Prateek Jain before starting his PhD. Has also interned at Google Research, Apple MLR, and Akuna Capital.
Education
  • PhD student at MIT EECS, advised by Aleksander Mądry. Received a BS in CS and Statistics from the University of Illinois at Urbana-Champaign.
Background
  • PhD student at MIT EECS, interested in understanding and steering large-scale machine learning models. Recent work focuses on developing tools for analyzing model behavior via targeted interventions to learning algorithms, training data, in-context information, and learned representations.
Miscellany
  • Enjoys cricket and tennis outside of research.