Staff Machine Learning Engineer, Inference Team

Google
Sunnyvale, CA, USA / Kirkland, WA, USA

About the job

Our mission is to provide the best possible cloud-based ML inference solutions to customers. We are committed to building and supporting products that let customers quickly adopt, build, and support model serving solutions on Cloud with predictable performance. We also strive to understand the best ways to serve and optimize models on accelerators, and to help our customers achieve their AI goals. In this role, you will work on an emerging product with the potential to change how customers use Google infrastructure for machine learning inference.

Responsibilities

Participate in feature investment and in building a stable stack for inference serving solutions on the latest Tensor Processing Unit (TPU) New Product Introductions (NPIs).

Develop techniques to improve long-context support in the inference serving stack.

Prepare and optimize very large reference models, demonstrating single-host and multi-host inference solutions, especially at large scale.

Collaborate with the ML research, ML performance, model optimization tooling, and other optimization teams.

Participate in ML performance inference submissions.

Qualifications

Minimum

Bachelor’s degree or equivalent practical experience.

8 years of experience in software development.

5 years of experience testing and launching software products, and 3 years of experience with software design and architecture.

5 years of experience with one or more of the following: Speech/audio (e.g., technology duplicating and responding to the human voice), reinforcement learning (e.g., sequential decision making), ML infrastructure, or specialization in another ML field.

5 years of experience with ML design and ML infrastructure (e.g., model deployment, model evaluation, performance evaluation, data processing, debugging, fine-tuning).

Preferred

Master’s degree or PhD in Engineering, Computer Science, or a related technical field.

8 years of experience with data structures and algorithms.

3 years of experience in a technical leadership role leading project teams and setting technical direction.

3 years of experience working in a complex, matrixed organization involving cross-functional or cross-business projects.