Data Scientist - Multimodal Training Data & Tools - SIML

Apple
Cupertino, United States of America
2026-03-31

About the job

We are seeking an experienced data scientist to help us build and deploy large-scale generative models. Responsibilities in the role include training ad-hoc models for data synthesis, building data pipelines and tools for large-scale data auto-grading across text and image, prompt engineering and optimization, and extracting insights from billion-scale datasets to enable model training.

Responsibilities

Training ad-hoc models for data synthesis; building data pipelines and tools for large-scale data auto-grading across text and image; prompt engineering and optimization; extracting insights from billion-scale datasets to enable model training

Qualifications

Minimum

Bachelor's or Master's degree in Electrical Engineering/Computer Science or a related field (mathematics, physics, or computer engineering), with a focus on data science; 3+ years of data science or related experience, preferably at a consumer tech company; Experience developing models for data synthesis and auto-grading to enable training generative models; Experience in prompt engineering and optimization for LLMs; Strong programming and problem-solving skills

Preferred

Strong problem-solving skills and ability to work in a collaborative, product-focused environment; Ability to communicate technical results clearly and concisely