Research Scientist, RL for Autonomous Planning & World Modeling

About the job

Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. The Waymo AI Foundations team develops machine learning solutions addressing open problems in autonomous driving, including reinforcement learning, learning from demonstration, generative modeling, Bayesian inference, hierarchical learning, and robust evaluation. In this hybrid role, you will report to a Principal Scientist.

Responsibilities

Participate in Waymo’s Foundation World Model post-training and evaluation

Research and develop cutting-edge RL and distillation techniques for autonomous vehicle trajectory planning

Integrate emerging research from the broader AI community into Waymo’s internal RL infrastructure, conducting rigorous ablations to identify and scale the most promising methods

Partner with engineering and research teams across Waymo to share recipes, techniques, and post-training best practices to accelerate our collective know-how

Qualifications

Minimum

PhD or Master's degree in Computer Science, Machine Learning, Robotics, or a similar technical field, with 3+ years of industry or postdoctoral research experience in Reinforcement Learning or Foundation Models

Demonstrated original contributions to the field through high-impact publications (arXiv, peer-reviewed conferences like NeurIPS/ICLR/CVPR), technical blog posts, or significant open-source contributions

Proficiency in implementing scalable, distributed, and performant model training flows, using approaches such as data parallelism, FSDP, and other sharding strategies

A willingness to work with the complexity of globally distributed inference infrastructure

Preferred

PhD in Computer Science, Machine Learning, or Robotics, with a research focus on Reinforcement Learning, Foundation Models, or Multi-Modal learning

Extensive experience designing and deploying Reinforcement Learning infrastructure, specifically for on-policy learning or alignment with human preferences

A consistent history of original contributions to the AI community, evidenced by first-author publications at top-tier venues (e.g., NeurIPS, ICLR, ICRA) or by maintaining significant open-source ML projects

Experience with large-scale (many-machine) training infrastructure and with techniques for large-model inference, such as model sharding and tensor parallelism