Multimodal Model Training and Inference Optimization Engineer

ByteDance
San Jose | 2026-02-09 | R&D

About the job

We are seeking an experienced Multimodal Model Training and Inference Optimization Engineer with expertise in optimizing AI model training and inference, including distributed training/inference and acceleration. The ideal candidate will work at the cutting edge of AI efficiency, enhancing the performance, scalability, and deployment of large-scale generative AI models.

Responsibilities

- Optimize large model training pipelines to improve efficiency, speed, and scalability.

- Develop and improve distributed training strategies, such as data parallelism, model parallelism, pipeline parallelism, and communication optimization, to accelerate model training.

- Benchmark and profile deep learning models to identify performance bottlenecks and optimize computational resources.

Qualifications

Minimum

- M.S. or Ph.D. in Computer Science, Electrical Engineering, Artificial Intelligence, or a related field.

- Experience in AI model training optimization.

- Strong software engineering skills, including proficiency in Python, C++, and CUDA.

- Strong proficiency in deep learning frameworks such as PyTorch, Megatron, and DeepSpeed.

- Experience with distributed training techniques such as data parallelism, model parallelism, and pipeline parallelism.

- Knowledge of transformers and diffusion models.

Preferred

- Publications at conferences such as MLSys, NeurIPS, ICLR, or ICML.

- Strong communication and teamwork skills.

- Self-motivated, with strong problem-solving skills.

- Ability to work collaboratively in multi-functional teams.

- Experience implementing and optimizing complex, performance-critical systems.