About the job
Do you want to build the future of Apple-scale, AI-enabled observability? We're looking for an experienced full-stack observability engineer to design and build cloud-native solutions that empower observability for Search, AIML Infrastructure, and Apple Intelligence products. We're at the forefront of inventing next-gen observability systems, blending cloud-first engineering, AI, and industry standards to deliver smart, scalable solutions. Your work will directly impact the experience of billions of users on their favorite Apple devices.
Responsibilities
You'll be a technical leader, collaborating with other senior engineers to design, develop and deploy cutting-edge observability solutions for our AI, Search & Knowledge products and infrastructure. You will also provide technical guidance, leverage AI pipelines, and mentor the team to deliver best-in-class solutions.
Qualifications
Minimum
7+ years of software engineering experience and a strong background in computer science: distributed systems, algorithms and data structures, APIs, and highly scalable, reliable systems and micro-services
Strong coding skills in Go, JavaScript, Java, Python
Demonstrated experience in designing and building large-scale enterprise observability solutions for data collection and storage, visualization and incident management
Demonstrated experience in building visualization solutions and features, with in-depth understanding of cloud-native visualization frameworks such as Grafana and Datadog
Experience in observability collection solutions using time series metrics, distributed traces, logs and profiles, with deep understanding of cloud-native technologies such as OpenTelemetry, Prometheus and Jaeger
Demonstrated proficiency in AWS services, including EKS and native Kubernetes, storage such as S3, networking, database, and observability/monitoring services
Excellent verbal and written communication skills and strong problem-solving skills
Excellent interpersonal skills for collaborating across teams and with stakeholders and open source collaborators
Preferred
Experience in building micro-services using public cloud infrastructure
Proven experience in delivering well-architected, reliable, highly scalable cloud-native distributed systems for data management, observability or analytics services
Experience in building large-scale incident management, alert management and notification systems
Experience using Gen AI LLMs and ML models for AI compute and model observability
Active cloud-native open source project contributions
Proficiency using cloud-native software development tools, including coding, CI/CD and testing frameworks