About the job
We are looking for a Principal Engineer with deep ML engineering expertise to lead the ML and science engineering effort across the AI Platforms organization at AWS. This is a cross-cutting leadership role spanning the full breadth of our ML development platform: data preparation, model evaluation, model deployment and customization, and agentic AI development experience.
Responsibilities
Design and build production-grade first-party (1P) agent architectures, including memory management, prompt optimization, tool use, and agentic orchestration systems
Define and own the evaluation strategy for AI Platforms, including LLM-as-Judge frameworks, automated benchmarking, and model quality assessment pipelines
Build ML-driven recommendation systems for model benchmarking, selection, deployment, and customization from Model Hub
Establish data quality evaluation pipelines and synthetic data generation infrastructure to support model training and fine-tuning at scale
Drive requirements and technical roadmap with the science team; translate research prototypes into engineering specifications and production systems
Define the ML engineering technical strategy across AI Platforms, establishing best practices for agent development, model evaluation, and science-to-production pipelines
Mentor senior engineers across the organization; conduct design reviews, technical deep-dives, and bar-raising interviews
Present technical strategy and roadmap to VP-level stakeholders; represent AI Platforms ML engineering in cross-organizational technical forums
Qualifications
Minimum
Experience in software development with a significant focus on ML engineering or applied AI systems
Experience designing and shipping production-grade ML systems or AI agents that operate at scale (not just prototype or research systems)
Experience with model evaluation, benchmarking, or quality assessment of ML/AI systems in a production environment
Experience with recommendation systems, retrieval systems, or similar ML systems running in production
Demonstrated ability to work across multiple teams as a technical leader, influencer, or architect
Preferred
Master’s or PhD in Machine Learning, Artificial Intelligence, Computer Science, or a related quantitative field
Experience building production LLM-powered agents including memory management, prompt optimization, tool use, and multi-step reasoning
Experience with LLM-as-Judge, automated evaluation methodologies, or building evaluation infrastructure for generative AI systems
Experience with prompt engineering and prompt optimization at scale (extending beyond a single model to fleet-scale optimization)
Experience with synthetic data generation, data quality evaluation, or data labeling pipelines for ML model training
Track record of driving science-engineering collaboration: working with research scientists, interpreting research outputs, and translating findings into production systems
Experience leading technical strategy across multiple engineering teams or sub-organizations
Publications, patents, or significant open-source contributions in ML/AI (NeurIPS, ICML, ICLR, KDD, RecSys, EMNLP, ACL, or equivalent venues preferred)