About the job
Our team builds ML-inference applications and services on Apple Silicon in the datacenter, specifically focusing in recent years on generative AI as part of the Private Cloud Compute component of Apple Intelligence.
Responsibilities
You will integrate inference code into a full service stack to ensure that user traffic is served reliably and performantly, with a strong focus on writing code that is easy and safe to develop, update, and monitor
Qualifications
Minimum
Experience working as a software engineer on large production systems
Experience programming in Swift, C, C++, or Python, or developing for iOS/macOS with Xcode
Practical experience running machine learning models and evaluating them for quality and performance metrics
Preferred
Familiarity with the Apple ML stack (ANE, Core ML, MPS/Metal), high-level general distributed ML stacks (PyTorch-distributed, NCCL), and high-throughput inter-chip communication systems
On-device iOS development
Quality focus: produce reliable, maintainable, deliverable software
Comfortable diving deep: working across multiple levels of abstraction
Good at handling relationships and communication: collaborate well with colleagues across a wide range of functions