Machine Learning Engineer - Speech & Multimodal Language Modeling

Apple
Cupertino, United States of America
2026-04-22

About the job

The Special Projects team at Apple is developing novel user-facing features that leverage the multimodal capabilities of state-of-the-art foundation language models. We are looking for a highly skilled Machine Learning Engineer to build and evaluate these experiences, with a specific focus on Multimodal and Speech Language Models. A successful candidate has experience evaluating complex foundation model-driven systems end-to-end and translating subjective product requirements into objective criteria, has strong statistical analysis skills, and has worked with Speech Language Models.

Responsibilities

Design and implement processes for evaluating and improving multimodal generative models to meet end-to-end product requirements

Work with Data Engineers to process large-scale speech audio data for foundation model training

Fine-tune Large Language Models (LLMs) and Speech Language Models (SpeechLMs) to improve performance for specific use cases

Work closely with other ML Researchers to define evaluation criteria and methodology to systematically evaluate foundation models

Design experiments to test models and systems

Conduct robust statistical analysis to identify model deficiencies and failure states

Qualifications

Minimum

Master’s degree in Computer Science or Machine Learning

2+ years of hands-on experience building and evaluating generative AI models

Proficiency in Python and ML frameworks (PyTorch or TensorFlow)

Preferred

PhD in Computer Science, Machine Learning, Statistics, or another STEM field

5+ years of hands-on experience with SpeechLMs or LLMs

Experience with large-scale audio data processing on distributed systems

Experience with prompt evaluation and optimization for generative AI models

Proficiency in training, fine-tuning, and evaluation of foundation models and frameworks

A track record of publications or technical presentations in Machine Learning journals or conferences

Excellent communication and cross-functional collaboration skills