AIML - Sr Machine Learning Engineer, Data and ML Innovation

About the job

As a Senior Machine Learning Engineer, you will contribute to the end-to-end development of large language models and agentic systems, from training pipelines to evaluation frameworks and production deployment. You will work at the intersection of modeling, infrastructure, and product, improving model quality through systematic experimentation and iteration. You’ll collaborate closely with research, infrastructure, and product teams to design robust training pipelines, build agent environments, and ship high-impact AI capabilities into real-world applications. This role blends deep modeling expertise with strong engineering fundamentals and offers the opportunity to shape both the technical direction and the ML platform powering Apple products.

Responsibilities

Model Training & Optimization

Design and implement large-scale LLM pretraining and post-training pipelines, including supervised fine-tuning, preference optimization, and continual learning.

Drive model hillclimbing through disciplined experimentation: dataset curation, hyperparameter tuning, and ablation studies.

Work on scalable training workflows using distributed frameworks.

Evaluation, Reward, and Data Systems

Develop evaluation frameworks for both offline benchmarks and online metrics, covering reasoning, tool use, and task success.

Design and maintain verifiers and rubric-based reward systems for agentic tasks and model alignment.

Build data pipelines covering generation, filtering, labeling, and replay buffers.

Agent & Environment Infrastructure

Build and maintain agent training environments, including tool APIs, simulators, and sandboxed runtimes.

Implement environment abstractions to support reinforcement learning and agent evaluation at scale.

Collaborate on large-scale RL infrastructure: the RL trainer, rollout system, and containerized environments.

Qualifications

Minimum

5+ years of hands-on ML engineering experience, including at least 1 year working directly on large language models or generative AI.

Bachelor’s, Master’s, or PhD in Computer Science, Machine Learning, or a related technical field — or equivalent practical experience.

Hands-on experience with LLM training workflows, including one or more of: pretraining or continued pretraining, supervised fine-tuning (SFT), or preference optimization (e.g., RLHF, DPO, PPO).

Strong software engineering fundamentals: debugging, testing, code reviews, and production reliability.

Demonstrated publication record at relevant conferences (e.g., NeurIPS, ICML, ICLR).

Preferred

Direct experience with agentic systems, including tool use, environment design, or reinforcement learning.

Experience building or operating training environments or simulators (gym-style, tool-based, or sandboxed environments).

Experience with model hillclimbing workflows: systematic experimentation, ablations, dataset iteration, and continuous quality improvement.

Ability to work across research and engineering boundaries, turning ideas into scalable systems.

Demonstrated creative and critical thinking, with an innate drive to improve how things work and a high tolerance for ambiguity.