About the job
We are seeking an experienced AI model optimization engineer with expertise in optimizing AI model training and inference, including distributed training and inference and acceleration techniques. The ideal candidate will work at the cutting edge of AI efficiency, enhancing the performance, scalability, and deployment of large-scale generative AI models.
Responsibilities
Optimize large model training pipelines to improve efficiency, speed, and scalability.
Develop and improve distributed training strategies, such as data parallelism, model parallelism, and pipeline parallelism, and optimize communication to accelerate model training.
Benchmark and profile deep learning models to identify performance bottlenecks and optimize computational resources.
Qualifications
Minimum
Master’s or PhD in Computer Science, Electrical Engineering, Artificial Intelligence, or a related field.
2+ years of experience in AI model training optimization.
Strong software engineering skills, including proficiency in Python, C++, and CUDA.
Strong proficiency in deep learning frameworks such as PyTorch, Megatron, and DeepSpeed.
Experience with distributed training techniques such as data parallelism, model parallelism, and pipeline parallelism.
Knowledge of transformers and diffusion models.
Preferred
No preferred qualifications listed.