Senior Software Dev Engineer, EC2 Nitro

Amazon
USA, WA, Seattle | 2026-02-19 | Onsite

About the job

Join the EC2 Nitro Machine Learning Systems team to revolutionize supercomputing in the cloud. We're seeking an experienced Software Development Engineer to build and optimize infrastructure powering the most computationally intensive AI/ML workloads. In this role, you'll establish EC2 as the definitive source for best-known-configurations across diverse ML applications while influencing future accelerated platform designs.

Responsibilities

Design and implement scalable performance measurement infrastructure that serves as the foundation for ML benchmarking across AWS, incorporating critical metrics like tokens/second, latency, and accelerator utilization

Lead technical projects establishing EC2 as the definitive source for ML performance best practices across diverse applications including LLMs, multimodal systems, and emerging model architectures

Develop and maintain comprehensive regression testing systems that validate performance across major component releases including frameworks, firmware, drivers, and networking infrastructure

Collaborate with hardware engineering teams to influence future accelerator platform designs based on performance insights gathered from state-of-the-art research and customer workloads

Build customer relationships by investigating complex performance challenges, developing solutions, and publishing actionable best practices through multiple channels

Qualifications

Minimum

5+ years of non-internship professional software development experience

5+ years of experience programming in at least one software programming language

5+ years of experience leading the design or architecture (design patterns, reliability, and scaling) of new and existing systems

Experience as a mentor or tech lead, or leading an engineering team

Knowledge of Machine Learning and LLM fundamentals, including transformer architecture, training/inference lifecycles, and optimization techniques

Preferred

5+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations

Bachelor's degree in computer science or equivalent

Knowledge of ML frameworks including JAX, PyTorch, vLLM, SGLang, Dynamo, TorchXLA, and TensorRT

Knowledge of machine learning model architecture and inference