Tech Lead Manager, Commerce AI, Quality and Evals

Google
Sunnyvale, CA, USA / Kirkland, WA, USA

About the job

Google Cloud’s mission is to make every business successful through AI by combining cutting-edge technology, infrastructure, and talent. AI/ML software engineers in Cloud bridge the gap between pioneering models and a massive product vehicle reaching billions. Our talent density and AI-powered tools drive rapid development, rooted in a culture of empowerment and a bias to action. In this role, you aren’t just building technology; you’re shaping the frontier of enterprise and driving the evolution of advanced models.

Join the Cloud Applied AI team to build the operational backbone for modern retail with Gemini Enterprise for Customer Experience. Our mission is to embed Google’s foundational AI directly into retailer infrastructure, creating a 'flywheel effect' where search, sales, and support converge. We are building the Shopping Agent, a multimodal concierge (text, voice, visual) that acts as a full-stack sales and support expert for global enterprise brands like Macy’s and Home Depot.

As the Tech Lead Manager for the Quality and Evals Pillar, you will lead a dedicated team of 12+ Software Engineers responsible for guaranteeing response safety, brand alignment, and exceptional AI performance at scale. You will champion our transition into a proactive, 'Vertical-First' engineering organization, ensuring that every agent we launch is reliable, consistent, and demonstrably ahead of the competition.

Responsibilities

Lead, mentor, and scale a high-performing Quality and Evals team (12+ SWEs), overseeing specialized pods including retail vertical owners (e.g., apparel, home and garden), evals hill climbing, and Return on Investment (ROI) metrics.

Enforce the 'Launch Bar' quality standards, managing the automated 'No-Regression' release gates and hermetic holdout datasets, and ensuring strict pass-rate thresholds for all release candidates.

Drive the 'Vertical-First' architectural strategy, moving the team away from custom, client-specific prompts to a modular, generic architecture that instantly elevates baseline performance across entire retail verticals.

Orchestrate aspirational 'Hill Climbing' efforts to continuously improve core agent metrics, including search accuracy, action accuracy, and expectation compliance.

Act as the strategic bridge between Core Engineering, Product Management, and Forward Deployed Engineers (FDEs), hosting bi-weekly 'State of Quality' syncs and demystifying the AI quality process for stakeholders.

Qualifications

Minimum

Bachelor’s degree or equivalent practical experience.

8 years of experience with software development in one or more programming languages (e.g., Python, C, C++, Java, JavaScript).

7 years of experience leading technical project strategy, ML design, and optimizing ML infrastructure (e.g., model deployment, model evaluation, data processing, debugging, fine tuning).

5 years of experience in a technical leadership role.

5 years of experience in a people management or team leadership role.

2 years of experience with GenAI techniques (e.g., LLMs, Multi-Modal, Large Vision Models) or with GenAI-related concepts (language modeling, computer vision).

Preferred

5 years of experience working in a complex, matrixed organization.

5 years of experience in engineering leadership, particularly within rapidly scaling enterprise SaaS or AI product teams.

Experience designing telemetry, observability, and data pipeline solutions to track real-time application metrics and user behavior.

Experience leveraging user simulation (e.g., Monte Carlo runs) and deterministic checks for complex AI evaluation.

Experience with prompt engineering, Retrieval-Augmented Generation (RAG) architectures, and AI agent orchestration/tool calling.

Familiarity with the commerce/retail tech ecosystem, including e-commerce conversion funnels, catalog ingestion, and search/discovery platforms.