Sr. Software Development Engineer, Frontier AI & Robotics

About the job

Join the next revolution in robotics at Amazon's Frontier AI & Robotics team, where you'll work alongside world-renowned AI pioneers like Pieter Abbeel, Rocky Duan, and Peter Chen to make breakthrough foundation models run at production scale. As a Senior Machine Learning Engineer embedded in our science team, you'll be instrumental in transforming innovative research into high-performance production systems. You'll collaborate directly with scientists to optimize large-scale transformer architectures for robotics applications, leveraging your expertise in CUDA and TensorRT to achieve unprecedented inference efficiency at Amazon scale.

Responsibilities

Drive inference optimization strategies for large-scale foundation models using TensorRT, CUDA, and other NVIDIA tools

Collaborate closely with scientists to influence model architectures for optimal hardware utilization

Design and implement efficient compilation pipelines for complex transformer architectures

Develop comprehensive benchmarking frameworks to measure and optimize model performance

Build robust monitoring solutions to ensure reliable model serving at scale

Explore and evaluate emerging optimization tools and techniques, including ONNX Runtime and ML compilers

Maintain high engineering standards through proper testing, documentation, and code review practices

Qualifications

Minimum

Bachelor's degree in computer science or equivalent

5+ years of non-internship professional software development experience

5+ years of programming experience with at least one programming language

5+ years of experience leading the design or architecture (design patterns, reliability, and scaling) of new and existing systems

Experience as a mentor or tech lead, or leading an engineering team

Strong expertise in Python, C++ and CUDA programming

Experience with TensorRT or similar ML optimization frameworks

Track record of optimizing ML models for production

Preferred

Expertise in NVIDIA's ML stack (cuDNN, CUDA Graphs, etc.)

Experience with ML compilers and runtimes (ONNX Runtime, TVM, etc.)

Experience with transformer model optimization

Background in performance profiling and optimization

Experience working directly with research teams

Track record of building robust monitoring systems

Experience with large-scale ML serving systems