About the job
The Waymo ML Frameworks & Efficiency team works with Research and Production teams to develop and deploy the Perception and Planning models that are core to our autonomous driving software. We support our partners with frameworks that cover the entire model development lifecycle and with efficiency solutions for model execution. Both are geared towards scaling models and solving problems unique to ML for autonomous driving. We are looking for engineers with ML frameworks or ML systems expertise to help us improve compute efficiency both in the cloud and on the car. You'll work across the entire ML stack, from deep learning model architectures through ML frameworks (e.g., JAX, XLA) to the accelerator runtime. You will work closely with ML modeling teams to drive large-scale, efficient model training and inference.
Responsibilities
Optimize distributed ML systems for high performance on TPU and GPU clusters, applying techniques such as SPMD, MPMD, and FSDP to scale our model training.
Improve the accelerator FLOPS efficiency of ML workloads by improving compiler optimizations (e.g., XLA), authoring low-level kernels (e.g., Pallas, Triton), and enabling low-precision computation.
Develop new neural model architectures (e.g., sparse architectures) and decoding strategies (e.g., speculative decoding) to improve training and inference performance on modern TPU and GPU architectures.
Evaluate and integrate state-of-the-art technologies from the open-source community and Google to enhance the performance and scalability of ML workloads.
Promote best practices for distributed systems architecture and contribute to technical leadership within the team.
Qualifications
Minimum
B.S. in Computer Science or Math, or 8+ years of equivalent real-world experience.
Proficient in distributed systems design with an understanding of ML efficiency.
Experience with ML frameworks and compilers, including TensorFlow, JAX, and XLA.
Solid programming skills in Python and C++.
Practical familiarity with profiling tools to uncover performance bottlenecks.
Preferred
M.S. in Computer Science or Math.
Familiarity with kernel-authoring languages like Pallas and Triton.