About the job
The Artificial General Intelligence (AGI) Customization Team is seeking a highly skilled and experienced Applied Scientist to support adoption and enable customization of Amazon Nova. The role focuses on developing state-of-the-art services and tools for model customization, including supervised fine-tuning, reinforcement learning, and knowledge distillation across large language models. As an Applied Scientist, you will play an important role in developing advanced customization capabilities that enable enterprises to build highly performant, application-specific models without training models from scratch. Your work will directly impact how companies leverage Amazon Nova models for their specific use cases.
Responsibilities
- Contribute to the development of novel customization techniques including extended post-training, continued pre-training, and advanced knowledge distillation
- Collaborate with cross-functional teams to design and implement enterprise-ready tooling for various training techniques on Amazon SageMaker
- Design and execute experiments to optimize model accuracy, latency, and cost across different customization approaches (SFT, DPO, PPO)
- Develop and enhance preference learning algorithms and training curricula for customer-specific applications
- Create robust evaluation frameworks for assessing model performance across different domains and use cases
- Contribute to the development of the Responsible AI toolkit, including creating training and evaluation datasets for model alignment
- Design and implement secure access mechanisms for early model checkpoints and weights
- Communicate technical insights and results to both technical and non-technical stakeholders through presentations and documentation
Qualifications
Minimum
- 3+ years of experience building models for business applications
- PhD, or Master's degree and 4+ years of experience in CS, CE, ML, or a related field
- Experience with patents or publications at top-tier peer-reviewed conferences or journals
- Experience programming in Java, C++, Python, or a related language
- Experience in any of the following areas: algorithms and data structures, parsing, numerical optimization, data mining, parallel and distributed computing, high-performance computing
- 1+ years of experience building machine learning models for business applications
- Master's degree, or PhD and 2+ years of applied research experience
- Experience with any programming language such as Python, Java, C++
- Experience in state-of-the-art deep learning model architecture design, training and optimization, and model pruning
Preferred
- Experience using Unix/Linux
- Experience in professional software development
- PhD in computer science, machine learning, engineering, or related fields, or Master's degree
- PhD in computer science, computer engineering, or related field, or experience with machine learning and deep learning toolkits such as MXNet, TensorFlow, Caffe, and PyTorch
- Experience that includes strong analytical skills, attention to detail, and effective communication abilities, or experience in software development and in managing and troubleshooting networks
- Experience collaborating with cross-functional teams
- Experience in developing and implementing algorithms and models for supervised fine-tuning and reinforcement learning
- Experience with patents or publications at top-tier peer-reviewed conferences or journals