AIML - Machine Learning Researcher, DMLI - Image/Video Generation

Apple
Santa Clara, United States of America
2026-03-31

About the job

We are hiring a researcher with a strong technical background in image/video generation and editing, as well as multimodal foundation models. You will play a critical role in the research and development of multimodal foundation models for image, video, and 3D generation, editing, animation, and more. As a member of the team, you will develop fundamental model capabilities, work on ambitious projects with teammates from diverse backgrounds, and collaborate broadly across Apple with world-class engineers and researchers to advance our products and delight millions of users.

Responsibilities

Developing, fine-tuning, and evaluating foundational image generation and image editing models, as well as unified multimodal foundation models capable of both visual understanding and generation.

Developing, fine-tuning, and evaluating domain-specific image generation and editing models for various tasks and applications in Apple’s AI-powered products.

Conducting innovative research and transferring pioneering research in generative AI to production-ready technologies.

Understanding product requirements and translating them into modeling and engineering tasks.

Qualifications

Minimum

PhD, MS, or equivalent experience.

Experience in machine learning, deep learning and statistical modeling.

Experience in developing models for computer vision tasks, such as object detection and visual question answering.

Experience with image generation models, such as VAEs, GANs, and diffusion models.

Proficiency in one of the following deep learning frameworks: PyTorch, JAX, TensorFlow.

Proficiency in one of the following languages: Python, Go, Java, C++.

Preferred

Experience in developing state-of-the-art image generation/editing models.

Good interpersonal skills and the ability to work well in a team.