About the job
We’re looking for a Product Manager for Voice: real-time, human-quality AI conversations. Voice is one of the most demanding and important surfaces for AI agents. It requires low latency, high reliability, natural turn-taking, and the ability to handle messy, real-world interactions across phone systems and a global customer base.
As PM for Voice, you will define how Sierra agents sound, respond, and behave in live conversations. You’ll shape the core voice experience, from first utterance → dialogue → resolution, and ensure agents perform reliably in production across telephony and real-time systems. This is a zero-to-one and scaling role at the intersection of speech, infrastructure, and product experience.
Responsibilities
Define the voice interaction model - Shape how agents handle real-time conversations: turn-taking, interruptions, latency, tone, and recovery from errors. Specify what “human-quality” voice interaction actually means in practice.
Build reliable real-time systems - Work closely with engineering on streaming architectures, latency budgets, and failure handling. Voice is unforgiving, so ensure agents respond quickly and consistently in production environments.
Own the voice stack experience - Partner across ASR, TTS, LLMs, and telephony integrations to deliver a cohesive product. Help decide model choices, orchestration strategies, and how different components work together.
Make voice measurable and improvable - Define how we evaluate voice agents: latency, interruption handling, resolution rate, and conversation quality. Build feedback loops that improve performance over time.
Translate real-world usage into product direction - Work closely with customers deploying voice agents in production. Understand edge cases (noisy environments, accents, call flows) and turn them into product improvements.
Qualifications
Minimum
3+ years of product management experience, with meaningful exposure to real-time systems, voice, or AI products
Experience shipping voice or real-time products - You understand the constraints of latency, streaming systems, and user expectations in synchronous interactions
Strong technical depth - Ability to engage deeply with engineers on system design (e.g., speech pipelines, streaming infra, telephony systems, reliability tradeoffs)
Experience working with AI systems - Familiarity with LLMs, speech-to-text, or text-to-speech systems and their limitations in production environments
Track record of 0→1 product development - Comfortable operating in ambiguous spaces and iterating quickly to reach product-market fit