Publications
Published papers: 'Do Language Models Robustly Acquire New Knowledge?' (NeurIPS CCFM, 2025) and 'Parameters vs FLOPs: Scaling Laws for Optimal Sparsity of MoE Language Models' (ICML, 2025 + ICLR SLLM, 2025). The former explores how robustly language models acquire new knowledge, probed through multi-hop reasoning tasks; the latter investigates the trade-off between the number of parameters and compute per example in sparse Mixture-of-Experts models, and its impact on model performance.
Research Experience
Currently a PhD student at MIT EECS; previously worked for two years at Microsoft Research with Praneeth Netrapalli and Prateek Jain.
Education
Received a BS in CS and Statistics from the University of Illinois at Urbana-Champaign, and has interned at Google Research, Apple MLR, and Akuna Capital. Advised by Aleksander Mądry during his PhD.
Background
PhD student at MIT EECS, interested in understanding and steering large-scale machine learning models. Recent work focuses on developing tools for analyzing model behavior via targeted interventions to learning algorithms, training data, in-context information, and learned representations.