About the job
In this role, you will serve as the Uber Technical Lead (UTL) for Observability Intelligence, driving strategic initiatives to pivot SRE incident response toward an AI-driven paradigm at a pivotal moment, as Google's monitoring systems undergo a generational evolution. You will be part of a transformative shift away from a disjointed collection of isolated tools and toward a cohesive "Northstar" observability ecosystem. We are seeking a leader with a proven history of managing business-critical domains and the expertise to navigate architectural trade-offs between urgent product requirements and long-term technical durability.
Responsibilities
Drive technical project strategy, lead large-scale ML infrastructure optimization, and oversee the design and implementation of solutions across multiple specialized ML areas.
Define and socialize a cohesive "Observability Intelligence" strategy that aligns with the broader Monitoring Northstar, ensuring that shared technical concerns are solved once for the entire organization.
Represent the Observability Intelligence organization in high-stakes technical reviews and collaborate across organizational boundaries (AlertManager, AI Operations, Incident Response Management, and Site Reliability Engineering teams across all Product Areas) to drive consensus on critical observability standards.
Act as the primary technical partner to Product Management, translating broad product "Whats" into scalable architectural "Hows."
Lead high-level design reviews that ensure technical consistency across the stack, prioritizing interoperability, reusability, and semantic cohesion.
Qualifications
Minimum
Bachelor’s degree or equivalent practical experience.
8 years of experience in software development.
7 years of experience managing technical projects, ML design, and working with industry ML infrastructure (e.g., model deployment, model evaluation, data processing, debugging, fine tuning).
5 years of experience with one or more of the following: Speech/audio (e.g., technology duplicating and responding to the human voice), reinforcement learning (e.g., sequential decision making), ML infrastructure, or specialization in another ML field.
5 years of experience with design and architecture, and with testing and launching software products.
Preferred
Master’s degree or PhD in Engineering, Computer Science, or a related technical field.
8 years of experience with data structures and algorithms.
5 years of experience in a technical leadership role leading project teams and setting technical direction.
3 years of experience working in a complex, matrixed organization involving cross-functional or cross-business projects.
Familiarity with and interest in the current AI landscape (large language models (LLMs), generative agents, etc.).