Researcher, Pretraining Safety

About the job

The Pretraining Safety team is pioneering how safety is built into models before they reach post-training and deployment. In this role, you will work across the full stack of model development with a focus on pretraining:

- Identify safety-relevant behaviors as they first emerge in base models
- Evaluate and reduce risk without waiting for full-scale training runs
- Design architectures and training setups that make safer behavior the default
- Strengthen models by incorporating richer, earlier safety signals

Responsibilities

Develop new techniques to predict, measure, and evaluate unsafe behavior in early-stage models

Design data curation strategies that improve pretraining priors and reduce downstream risk

Explore safe-by-design architectures and training configurations that improve controllability

Introduce novel safety-oriented loss functions, metrics, and evals into the pretraining stack

Work closely with cross-functional safety teams to unify pre- and post-training risk reduction

Qualifications

Minimum

Have experience developing or scaling pretraining architectures (LLMs, diffusion models, multimodal models, etc.)

Are comfortable working with training infrastructure, data pipelines, and evaluation frameworks (e.g., Python, PyTorch/JAX, Apache Beam)

Enjoy hands-on research: designing, implementing, and iterating on experiments

Enjoy collaborating with diverse technical and cross-functional partners (e.g., policy, legal, training)

Are data-driven with strong statistical reasoning and rigor in experimental design

Value building clean, scalable research workflows and streamlining processes for yourself and others