Machine Learning Researcher, Foundation Models [SWE Org]

About the job

We build frontier foundation models that power intelligent experiences at Apple. Our team works across the full training lifecycle, including pre-training foundation models and developing mid-training approaches that bridge general capability and task-specific performance. What makes our work distinct is that we're engineering models specifically for Apple silicon and optimizing them for experiences that are private, personal, and deeply integrated into the OS. We're solving frontier problems: building reward models that resist reward hacking, handling sparse and delayed rewards in agentic settings, and aligning models reliably across the spectrum from open-ended creative tasks to precise, action-taking workflows. If you're drawn to hard problems where the research and the product are inseparable, this is the team.

Responsibilities

In this role, you will focus on pre-training, large language model (LLM) architecture, and the scientific scaling of LLMs. Experience with full-stack LLM optimization, such as mid-training, reinforcement learning, data research, and kernel optimization (e.g., Pallas and Triton), is a plus.

Qualifications

Minimum

Demonstrated expertise in deep learning, with a publication record at relevant conferences (e.g., NeurIPS, ICML, ICLR, COLM, ACL, NAACL, EMNLP) or a track record of applying deep learning techniques to products.

Proficiency in Python and at least one deep learning toolkit such as JAX, PyTorch, or TensorFlow.

Ability to work in a collaborative environment.

PhD, or equivalent practical experience, in Computer Science or a related technical field.

Preferred

Large language models for code.

Reinforcement learning and on-policy distillation.

Post-training and mid-training of large language models.

LLM context lengthening.