Selected Publications: DATE-LM: Benchmarking Data Attribution Evaluation for Large Language Models (NeurIPS 2025); Fairshare Data Pricing for Large Language Models (NeurIPS 2025). See Google Scholar for all publications.
Research Experience
She spent a wonderful summer at Spotify Research, hosted by Paul Bennett. Previously, she worked in industry on machine learning and deep learning applications for natural language processing.
Education
PhD: Carnegie Mellon University, School of Computer Science, Language Technologies Institute, advised by Chenyan Xiong; Master's: CMU LTI, advised by Maxine Eskenazi and Aaron Steinfeld; Bachelor's: University of British Columbia, BS in CS & Math, graduated with distinction.
Background
Research interests: data-centric AI, particularly data attribution. Her work focuses on designing methods and frameworks to better understand, curate, and evaluate the data for large language models. She aims to develop frameworks that make data usage more transparent, reliable, and impactful for both research and deployment.
Miscellany
In her spare time, she enjoys cooking and biking around Pittsburgh.