About the job
This role is part of Uber’s ML Serving team within the AI Platform, responsible for defining and evolving the infrastructure that powers real-time ML and generative AI inference at Uber scale. As a Staff Software Engineer, you will set technical direction for ML serving systems, lead cross-team initiatives, and design foundational architectures that support thousands of models in production. Your work will shape Uber’s long-term strategy for scalable, reliable, and efficient ML serving.
Responsibilities
Define architecture and technical strategy for Uber’s ML serving and inference platforms
Lead cross-team efforts to scale and evolve serving infrastructure for predictive and generative AI workloads
Design systems that balance latency, cost, reliability, and developer productivity
Act as a technical leader and mentor across the ML Platform organization
Drive operational excellence and long-term sustainability of mission-critical ML systems
Qualifications
Minimum
BS or MS in Computer Science or a related technical discipline, or equivalent experience
8+ years of full-time engineering experience
Extensive experience designing and operating large-scale distributed systems in production
Deep expertise in backend systems, system architecture, and performance optimization
Strong leadership skills with a track record of driving complex technical initiatives
Preferred
Deep experience with ML serving platforms, inference orchestration, or real-time AI systems
Experience supporting high-throughput, low-latency workloads at global scale
Strong understanding of the ML model lifecycle, observability, and reliability at scale
Proven ability to influence technical direction across multiple teams and stakeholders