About the job
We are seeking an experienced GPU Performance Engineer with a strong background in Python and large-scale model training. In this role, you will design and implement improvements to our training infrastructure and directly contribute to technical decisions that optimize performance of our models. You will also work on post-training processes, including reinforcement learning and fine-tuning. Furthermore, you will contribute to improving the efficiency and scalability of our model serving infrastructure.
Responsibilities
No responsibilities listed.
Qualifications
Minimum
Strong engineering skills with fluency in Python and PyTorch (or other frameworks).
Proven experience implementing and training large deep learning models.
Experience writing and debugging low-level GPU code (CUDA, C++).
Experience scaling up GPU jobs using large-scale compute clusters (e.g., Slurm or Kubernetes).
Demonstrated ability to analyze and optimize the performance of GPU-accelerated workloads, including profiling, identifying bottlenecks, and implementing performance tuning techniques.
Preferred
No preferred qualifications listed.