Applied AI Scientist - Multimodal Intelligence

About the job

We are looking for a highly motivated person to help bridge the gap between research advances and practical applications in generative AI and multimodal foundation models. You will evaluate and adapt emerging research, conduct applied research experiments, and work with engineering teams to transform promising approaches into robust solutions that account for future hardware design and product needs. You will also collaborate with several teams across Apple to deliver the best products.

Responsibilities

- Evaluating and adapting emerging research
- Conducting applied research experiments
- Working with engineering teams to transform promising approaches into robust solutions
- Taking into account future hardware design and product needs
- Engaging and collaborating with several teams across Apple to deliver the best products

Qualifications

Minimum

- Experience in deep learning with demonstrated work in at least one area of multimodal systems (e.g., vision, language, or video)
- Proficiency in Python and in a modern deep learning framework such as PyTorch or JAX
- Experience with rapid prototyping, reproduction, and validation of research ideas
- Master's degree, or equivalent practical experience, in Computer Science, Computer Vision, Machine Learning, or a related technical field

Preferred

- PhD, or equivalent practical experience, in Computer Science, Machine Learning, or a related technical field
- Demonstrated expertise in deep learning, with either a publication record in relevant conferences (e.g., NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, COLM) or a strong track record of applying deep learning techniques to real-world products
- Experience with foundation models (language or multimodal)
- Familiarity with large-scale data pipelines, including data curation, preprocessing, and efficient storage
- Experience fine-tuning or optimizing large models for production deployment
- Experience applying foundation models to build autonomous or semi-autonomous agents, including planning, task decomposition, and multi-step reasoning
- Familiarity with privacy-preserving or on-device machine learning
- Ability to work effectively in a multi-functional, collaborative environment