Performance Engineer

About the job

Running machine learning (ML) algorithms at our scale often requires solving novel systems problems. As a Performance Engineer, you'll be responsible for identifying these problems and then developing systems that optimize the throughput and robustness of our largest distributed systems. Strong candidates will have a track record of solving large-scale systems problems and will be excited to grow into ML experts as well.

Responsibilities

Implement low-latency, high-throughput sampling for large language models

Implement GPU kernels to adapt our models to low-precision inference

Write a custom load-balancing algorithm to optimize serving efficiency

Build quantitative models of system performance

Design and implement a fault-tolerant distributed system running with a complex network topology

Debug kernel-level network latency spikes in a containerized environment

Qualifications

Minimum: you

Have significant software engineering or machine learning experience, particularly at supercomputing scale

Are results-oriented, with a bias towards flexibility and impact

Pick up slack, even if it goes outside your job description

Enjoy pair programming (we love to pair!)

Want to learn more about machine learning research

Care about the societal impacts of your work

Preferred: experience with

High-performance, large-scale ML systems

GPU/accelerator programming

ML framework internals

OS internals

Language modeling with transformers