About the job
Generative artificial intelligence is reshaping how we serve clients and run the firm. In the Chief Data and Analytics Office, you will lead the delivery of enterprise-grade generative artificial intelligence products and platforms with strong governance and controls. You will partner across machine learning, cloud engineering, and site reliability engineering to ship resilient solutions with clear return on investment. This is a hands-on leadership role for someone who enjoys building at scale and operating in real production environments.
Responsibilities
Lead the design and delivery of production generative artificial intelligence products and reusable backend application programming interfaces for firmwide adoption
Architect scalable systems that combine large enterprise datasets with large language and multimodal models
Set technical direction for model-enabled services, including quality, latency, throughput, and cost targets
Partner with cloud engineering and site reliability engineering teams to deliver resilient architectures, observability, and operational readiness
Drive the translation of research concepts into production-ready capabilities through evaluation, iteration, and hardening
Establish engineering standards for reliability, security, and responsible artificial intelligence controls across the product lifecycle
Own delivery planning and execution, including risks, dependencies, and stakeholder communication
Define and manage objectives and key results aligned to business outcomes, adoption, and return on investment
Mentor and develop engineers through coaching, technical reviews, and modeling best practices
Troubleshoot critical production issues, lead root-cause analysis, and implement long-term preventative improvements
Qualifications
Minimum
PhD in a quantitative discipline such as Computer Science, Mathematics, or Statistics, or equivalent practical experience
7+ years of experience in machine learning engineering and/or applied software engineering delivering production systems
3+ years of technical leadership experience, including leading delivery for complex cross-functional initiatives
Demonstrated experience owning enterprise machine learning services, including reliability, incident management, and service-level outcomes
Strong fundamentals in statistics, optimization, and machine learning theory with applied expertise in natural language processing and/or computer vision
Hands-on experience implementing distributed, multi-threaded, scalable systems (for example Ray, Horovod, or DeepSpeed)
Proven ability to design and scale service-oriented architectures and application programming interfaces with high availability and performance requirements
Experience defining success metrics and writing clear objectives and key results aligned to business expectations
Strong judgment in aligning technical decisions with governance, risk, and control requirements for responsible artificial intelligence
Excellent communication and stakeholder management skills, with the ability to influence senior technical and business audiences
Preferred
Experience designing and implementing machine learning pipelines using directed acyclic graph frameworks (for example Kubeflow, DVC, or Ray)
Experience building batch and streaming microservices exposed via gRPC and/or GraphQL
Demonstrable experience with parameter-efficient fine-tuning, quantization, and quantization-aware fine-tuning for large language models
Experience with multimodal large language model use cases (text plus image, speech, or video)
Experience with advanced prompting and reasoning approaches such as chain-of-thought, tree-of-thought, or graph-of-thought
Experience establishing evaluation frameworks and production monitoring for model quality, safety, and drift
Experience building reusable platforms that enable other teams to ship model-enabled products faster