Scholar
Dylan Sam
Google Scholar ID: 43ffAwcAAAAJ
PhD Student, Carnegie Mellon University
Machine Learning
Homepage
Google Scholar
Citations & Impact
All-time
Citations
158
H-index
7
i10-index
5
Publications
19
Co-authors
10
list available
Contact
Twitter
GitHub
Publications
6 items
When Should We Introduce Safety Interventions During Pretraining?
arXiv.org · 2026
Cited
0
Evaluating Language Model Reasoning about Confidential Information
2025
Cited
0
Safety Pretraining: Toward the Next Generation of Safe AI
2025
Cited
0
Analyzing Similarity Metrics for Data Selection for Language Model Pretraining
2025
Cited
0
Predicting the Performance of Black-box LLMs through Self-Queries
2025
Cited
0
Finetuning CLIP to Reason about Pairwise Differences
arXiv.org · 2024
Cited
4
Resume (English only)
Academic Achievements
[May 2025] Gave a talk at UMD on 'Safety Pretraining', an approach to building natively safe LLMs during pretraining
[Feb 2025] Shared Google internship work on analyzing similarity metrics in LLM pretraining data curation
[Jan 2025] Released new work on black-box monitoring of LLM performance and behaviors
[Apr 2024] Presented 'Auditing Fairness under Unobserved Confounding' at AISTATS
[Apr 2024] Presented work on generalization bounds for prompt engineering in VLMs at ICLR
[Jul 2023] Gave a talk at the KLR workshop @ ICML 2023 on learning data-driven priors for BNNs incorporating interpretable domain knowledge
Research Experience
Student Researcher at Google Research
Research Intern at Amazon AWS
Research Intern at Bosch Center for AI
Research Intern at NASA JPL
Summer 2024: Worked at Google Research in NYC on data curation for LLM pretraining
Background
Final-year PhD student in the Machine Learning Department at Carnegie Mellon University (CMU)
Research focuses on making language models behave in safe, controllable, and predictable ways
Current work includes curating safer training data and monitoring models for harmful behavior
Previously worked on aligning models using weak supervision or interpretable domain knowledge, and on generalization
Supported by an NSF Graduate Research Fellowship
Currently collaborating with Gray Swan AI
Co-authors
10 total
Zico Kolter
Carnegie Mellon University
Stephen H. Bach
Assistant Professor of Computer Science, Brown University
Alessio Mazzetto
Brown University
Yiding Jiang
Carnegie Mellon University
Rattana Pukdee
Carnegie Mellon University
Marc Finzi
Postdoc at Carnegie Mellon University
Alex Robey
Postdoc, Carnegie Mellon University
Andy Zou
PhD Student, Carnegie Mellon University