Browse publications on Google Scholar ↗
Resume (English only)
Background
Research Scientist in the Language Team at Google DeepMind
Currently working on data quality for pretraining and fine-tuning stages of large language models (Gemini)
Passionate about training data attribution at scale: measuring how each model output is influenced by individual training examples
Aims to use attribution insights to improve model quality, enable data curation with model feedback, and uncover causal links between training data and model behavior
Has long worked on interpretability and model understanding for language and vision models, including feature- and example-level attribution, counterfactual analysis, and concepts in embedding spaces
Enjoys engineering and has built large-scale AI/ML infrastructure for model and dataset debugging, including Google Cloud XAI and retrieval over billions of training examples based on model internals