Publications
Published work on efficient parallelization layouts for large-scale distributed model training, which won Best Paper at the WANT@NeurIPS workshop; the 'I Don't Know' token paper was accepted at NeurIPS 2024; the FOCUS paper was accepted at EMNLP 2023.
Research Experience
Interned at Apple in Barcelona working on multilingual post-training with reinforcement learning; completed an ML research internship with InstaDeep in Paris focusing on multimodal generative protein design; worked as a software engineer at SAP in Newport Beach, California, after undergrad.
Education
ELLIS PhD student at the Hasso Plattner Institute and ELLIS Unit Potsdam, advised by Gerard de Melo and co-advised by Desmond Elliott at the University of Copenhagen.
Background
Research interests include multilingual NLP, tokenizers, and embeddings. Particularly focused on 'freeing' pretrained large language models from their static vocabularies by developing better methods for tokenizer transfer and embedding initialization of new tokens. Also interested in computationally efficient training of large language models, uncertainty quantification with a special 'I Don't Know' token, and conditional image generation using GANs.
Miscellany
Enjoys playing the saxophone, chess, alpine hiking, and solo traveling in his spare time.