Senior Machine Learning Engineer – VLM/LLM Evaluation

Waymo
Mountain View, CA | San Francisco, CA | Kirkland, WA | New York City, NY / Kirkland (US-KIR-6THD), Kirkland, Washington, United States / Mountain View (US-MTV-EMF680), Mountain View, California, United States

2026-02-23

About the job

Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building the Waymo Driver, The World's Most Experienced Driver™, to improve access to mobility while saving thousands of lives now lost to traffic crashes. The Waymo Driver powers Waymo’s fully autonomous ride-hail service and can also be applied to a range of vehicle platforms and product use cases. The Waymo Driver has provided over ten million rider-only trips, enabled by its experience autonomously driving over 100 million miles on public roads and tens of billions in simulation across 15+ U.S. states.

The mission of the Waymo AI Foundations team is to develop machine learning solutions addressing open problems in autonomous driving, towards the goal of safely operating Waymo vehicles in dozens of cities and under all driving conditions. As part of our work, we also initiate and foster collaborations with other research teams in Alphabet. AI Foundations areas that we are currently focusing on include reinforcement learning, learning from demonstration, generative modeling, Bayesian inference, hierarchical learning, and robust evaluation.

This role follows a hybrid work schedule and you will report to a Senior Staff Software Engineer.

Responsibilities

Work with a creative team of people who help to build the state-of-the-art Foundation Models that are used throughout Waymo’s systems, both onboard autonomous vehicles and offboard in simulation

Drive the development of, or significantly contribute to, end-to-end evaluation systems and benchmarks for Waymo Foundation Models, encompassing the entire life cycle from pre-training and supervised fine-tuning (SFT) to reinforcement learning (RL), for evaluating the quality, safety, and realism of embodied AI agents

Partner with cross-functional teams within the organization to land innovative tech in production

Implement and extend large-scale data and evaluation pipelines

Qualifications

Minimum

Bachelor's or Master’s degree in Computer Science or a similar technical field of study, or equivalent practical experience

Experience in ML engineering and applied deep learning

Experience with large-scale distributed systems

Proficient programming skills (e.g., Python, C/C++)

Preferred

ML infra experience: training, evaluating and deploying ML models at scale

Deep learning experience, especially with generative models, e.g., LLMs/VLMs, and/or reinforcement learning

Proficiency in, and in-depth knowledge of the inner workings of, an ML framework (e.g., PyTorch, JAX, TensorFlow)