Cathy Jiao

Google Scholar ID: fd1et9QAAAAJ
Carnegie Mellon University
Natural Language Processing, Data Attribution, Machine Learning, Deep Learning
Citations & Impact
All-time
  • Citations: 129
  • H-index: 4
  • i10-index: 2
  • Publications: 11
  • Co-authors: 0
Resume (English only)
Academic Achievements
  • Selected Publications: DATE-LM: Benchmarking Data Attribution Evaluation for Large Language Models (NeurIPS 2025); Fairshare Data Pricing for Large Language Models (NeurIPS 2025). See Google Scholar for all publications.
Research Experience
  • Spent a wonderful summer at Spotify Research, hosted by Paul Bennett; previously worked in industry on machine learning and deep learning applications for natural language processing.
Education
  • PhD: Carnegie Mellon University, School of Computer Science, Language Technologies Institute (LTI), advised by Chenyan Xiong.
  • Master's: CMU LTI, advised by Maxine Eskenazi and Aaron Steinfeld.
  • Bachelor's: University of British Columbia, BS in CS & Math, graduated with distinction.
Background
  • Research interests: data-centric AI, particularly data attribution. Her work focuses on designing methods and frameworks to better understand, curate, and evaluate the data for large language models. She aims to develop frameworks that make data usage more transparent, reliable, and impactful for both research and deployment.
Miscellany
  • In her spare time, she enjoys cooking and biking around Pittsburgh.