About the job
The Artificial General Intelligence (AGI) team is looking for a passionate, talented, and inventive ML Engineer with a strong machine learning background to build customization capabilities such as fine-tuning and distillation. As an ML Engineer with the AGI team, you will be responsible for leading the development of novel LLM training techniques and optimizations to advance the state of LLMs. You will collaborate closely with Applied Scientists and leverage Amazon’s heterogeneous data sources and large-scale computing resources to accelerate development of multimodal Large Language Models and Generative Artificial Intelligence solutions.
Responsibilities
Work with other engineers on the team to investigate design approaches, prototype new technology, and evaluate technical feasibility.
Work closely with Applied Scientists to process data, scale machine learning models while optimizing
Work in an Agile/Scrum environment to deliver high-quality software.
Qualifications
Minimum
3+ years of non-internship professional software development experience
Experience in developing and deploying LLMs in production on GPUs, Neuron, TPU or other AI acceleration hardware
2+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems
Knowledge of machine learning model architecture and inference
3+ years of experience designing and developing large-scale, multi-tiered, multi-threaded, embedded, or distributed software applications, tools, systems, and services using C#, C++, Java, or Perl
Bachelor's degree or foreign equivalent in Computer Science, Engineering, Mathematics, or a related field
Knowledge of ML frameworks including JAX, PyTorch, vLLM, SGLang, Dynamo, TorchXLA, and TensorRT
Preferred
A commitment to teamwork, hustle, and strong communication skills (with both business and technical partners) are absolute requirements. Creating reliable, scalable, and high-performance AI products requires exceptional technical expertise and a sound understanding of the fundamentals of Computer Science and Machine Learning. This person has thrived and succeeded in delivering high-quality technology products and services in a hyper-growth environment.