Senior/Staff ML Engineer, 3D/4D World Modeling, Simulation

Waymo
Mountain View, CA, USA / Mountain View (US-MTV-EMF680), Mountain View, California, United States
2025-03-06

About the job

Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. The Simulator Team at Waymo builds state-of-the-art simulations of realistic environments for testing, training, and validation of the Waymo Driver. We develop industry-leading simulation solutions that use advanced generative and reconstructive ML algorithms to model the real world, including realistic agents, roads, traffic systems, weather, and the full sensor suite (Camera, Lidar, Radar).

Responsibilities

Lead the design, development, and deployment of cutting-edge 4D world models and generative systems for ultra-realistic and controllable sensor and semantics generation for simulation use cases at Waymo.

Architect and implement scalable and robust ML pipelines for training, evaluating, and deploying large-scale generative models into our simulation infrastructure, including techniques like model distillation and quantization.

Build and scale production-ready video generation techniques (e.g., Diffusion, Flow Matching) to create dynamic and interactive simulation environments.

Apply Vision Language Models (VLMs) to enhance the semantic understanding and controllability of our world simulation products.

Partner with world-class research teams across Waymo and Alphabet to turn state-of-the-art research in 4D world modeling and generative AI into robust, production-ready solutions.

Mentor and provide technical guidance to other engineers on the team.

Qualifications

Minimum

MS or PhD in Computer Science, Machine Learning, Robotics, or a related field.

5+ years of experience in ML engineering and applied Deep Learning, with a strong portfolio of shipped products or publication record.

Proven experience in developing and training large-scale generative models for video generation (e.g., Diffusion models, Flow Matching) or Vision Language Models (VLMs) and their applications.

Deep expertise in 3D World Modeling or 3D computer vision.

Familiarity with 3D reconstruction and rendering techniques (e.g., 3D Gaussian Splatting).

Strong programming skills in Python and experience with ML frameworks such as JAX/Flax, PyTorch, or TensorFlow.

Preferred

PhD and a strong track record of delivering impactful ML products in 3D generative models, world models, or video generation.

Experience in simulating sensor data (Camera, Lidar, Radar) and/or semantic scenes.

Experience with autonomous systems, robotics, or autonomous vehicle simulation.

Experience in training and optimizing large-scale models on GPU/TPU clusters for efficient serving.

Experience in C++ for production systems.