Mansheej Paul

Google Scholar ID: gmZt3VgAAAAJ
Research Scientist, Databricks
Citations & Impact
  • Citations (all-time): 1,563
  • H-index: 11
  • i10-index: 11
  • Publications: 19
  • Co-authors: 0 (list not available)
Resume
Academic Achievements
  • Publications can be found on Google Scholar.
Research Experience
  • Research scientist at Mosaic AI Research, Databricks; research intern at FAIR, Meta AI; worked on the science of deep learning through the lens of data, loss landscapes, and neural tangent kernels.
Education
  • Ph.D. in Applied Physics from Stanford University, advised by Surya Ganguli.
Background
  • Research interests span pre-training and post-training of LLMs, with a focus on optimizing data quality, distribution, and curricula. Currently building synthetic data pipelines that scale inference compute and create diverse generations, and developing strategies to verify and filter those generations into high-quality training data. Aims to create reliable, consistent, and trustworthy AI systems through rigorous evaluation of model behavior and of how training data properties shape it.
Miscellany
  • Volunteered for SF New Deal, helping research and draft their economic impact report; enjoys social dancing, mostly West Coast Swing.