Senior Software Engineer, GPU Performance

Google
Sunnyvale, CA, USA / New York, NY, USA / Seattle, WA, USA

About the job

Google's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. Our products need to handle information at massive scale, and extend well beyond web search. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google’s needs with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. We need our engineers to be versatile, display leadership qualities and be enthusiastic to take on new problems across the full stack as we continue to push technology forward.

While Google is known for pioneering work with TPUs, GPUs are an equally vital and rapidly expanding frontier within its machine learning infrastructure. GPUs are indispensable to Google’s ever-evolving landscape for strategic, pragmatic, and performance-driven reasons: ensuring top performance for our ML models, adapting to ML workloads, achieving results, and influencing next-gen GPU architectures via strategic partnerships. In recognition of hardware as a strength, Google’s Core ML organization is heavily invested in growing a powerhouse team of GPU experts, and we invite you to be at its vanguard! This is your opportunity to move beyond incremental improvements and architect truly transformative solutions, shaping the future of AI and accelerated computing for Google and the world.

Responsibilities

Build optimizations for the latest generation of GPUs that power Google’s most critical products and services, impacting billions of users worldwide.

Address the most challenging performance bottlenecks through Google’s unparalleled access to the latest generation of GPUs, tooling, and a decade of experience building AI accelerators.

Drive optimizations across Google’s GPU software stack, from ML compiler cost model design to high-performance GPU kernel optimization to cross-node model serving configurations.

Influence the technical direction of the GPU software ecosystem at Google by collaborating with ML, compiler design, and systems architecture teams.

Influence the deployment of Google’s GPU fleet by working with various product teams across Google.

Qualifications

Minimum

Bachelor’s degree or equivalent practical experience.

5 years of experience with software development in one or more programming languages.

3 years of experience testing, maintaining, or launching software products, and 1 year of experience with software design and architecture.

Experience with modern GPU architectures (NVIDIA, AMD, or other AI accelerators), memory hierarchies, and performance bottlenecks.

Experience with low-level GPU programming (CUDA, Triton, CUTLASS, etc.) and performance engineering techniques.

Experience with modern LLMs and their deployment on AI accelerators.

Preferred

Master's degree or PhD in Computer Science or related technical field.

5 years of experience with data structures and algorithms.

1 year of experience in a technical leadership role.

Experience with compiler optimization, code generation, and runtime systems for GPU architectures (OpenXLA, MLIR, Triton, etc.).