ML Accelerator Architect

Waymo
Mountain View, CA, US / New York City, NY, US / Mountain View (US-MTV-EMF680), Mountain View, California, United States
2026-01-05

About the job

Waymo's Compute Team is tasked with a critical and exciting mission: We deliver the compute platform responsible for running the fully autonomous vehicle's software stack. To achieve our mission, we architect and create high-performance custom silicon; we develop system-level compute architectures that push the boundaries of performance, power, and latency; and we collaborate closely with many other teammates to ensure we design and optimize hardware and software for maximum performance. We are a multidisciplinary team seeking curious and talented teammates to work on one of the world's highest performance automotive compute platforms.

Responsibilities

Analyze workloads and map them efficiently to hardware, proposing novel HW-friendly implementations and projecting performance

Architect, simulate and design amazing machine learning solutions for our autonomous driving technology

Work closely with compiler and model developers to influence engineering trade-offs and future model architectures

Build scalable tools for simulator modeling and performance evaluation

Interact with cross-functional engineering teams to identify opportunities and requirements

Qualifications

Minimum

BS degree in Computer Science, Computer Engineering, or a similar technical field, or equivalent practical experience

3+ years of experience designing or architecting complex, high-performance architectures (CPUs, GPUs, and/or ML accelerators) in industry or through doctoral research

1+ years of experience with machine learning architectures, acceleration, and model optimization

Strong C++ programming and algorithmic problem solving skills

Preferred

1+ years of experience modeling high-performance architectures in cycle-aware simulators

Track record of analyzing workloads and of architecting and delivering novel HW+SW solutions that vastly improve performance and efficiency

Familiarity with ML model architectures and their compute characteristics (bottlenecks, optimization opportunities)

Experience with microarchitecture design (SystemVerilog or HLS)