About the job
In this role, you will define and drive the technical vision for long-horizon, supervised agentic systems at Zillow. You will design agent architectures, memory and supervision frameworks, and learning strategies that allow our agents to build and maintain context-rich relationships over weeks and months, not just single sessions, while remaining trustworthy, compliant, and measurably helpful. You will leverage Zillow’s uniquely rich domain datasets—such as interaction logs, behavioral signals, and structured real-estate and transaction data—to train and adapt agents that make high-quality, human-caliber recommendations at scale.
Responsibilities
Own the end-to-end research and technical strategy for long-horizon agentic experiences across shopping, financing, and professional workflows, in close partnership with the Agentic AI, data platform, and product teams.
Design and advance LLM post-training and evaluation methods (e.g., SFT, preference learning, RLHF/RLAIF, long-context modeling) tailored to supervised, high-stakes journeys in a complex, regulated domain.
Architect systems that combine persistent memory, tool use, and multi-agent collaboration to deliver consistent, context-rich guidance over long timelines.
Translate Zillow’s heterogeneous data (text, voice, behavioral, and structured real-estate/transaction data) into agent-ready knowledge and signals, in partnership with data and platform teams.
Collaborate with product and design to define success metrics, evaluation frameworks, and experiment plans for agentic experiences, including human-in-the-loop supervision and safety reviews.
Operate as a senior IC with the option to lead a small pod (up to ~5 scientists/engineers) focused on long-horizon agentic systems, mentoring principal-level talent and setting a high technical bar.
Represent Zillow’s work externally through publications, talks, open-source contributions, and thoughtful engagement with the research community, helping position Zillow as a destination for top agentic AI talent.
Qualifications
Minimum
A PhD in Computer Science, Electrical Engineering, or a related field—or equivalent experience—with emphasis in areas such as foundational LLMs, agentic AI, reinforcement learning, AI planning, or natural language processing.
10+ years of hands-on experience building and deploying large-scale AI systems, including at least several years focused on agent-based systems, multi-agent collaboration, or long-horizon conversational assistants.
Deep, current expertise in generative and agentic AI, including multimodal foundation models, transformers, advanced reasoning models, and post-training techniques (SFT, DPO, RLHF/RLAIF, preference learning, etc.).
A track record of leading ambiguous, cross-functional initiatives from concept to production—framing the problem, shaping data strategy, designing models and evaluation, and iterating based on live metrics and user feedback.
Demonstrated impact in the research community through publications at top venues (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR) and/or widely used open-source contributions in LLMs, agentic frameworks, or related areas.
Experience designing evaluation and supervision frameworks for long-horizon agents, ideally in regulated or high-stakes domains such as finance, healthcare, or large-scale marketplaces, with an emphasis on safety, fairness, and trust.
Strong technical leadership skills: you mentor senior scientists and engineers, create clarity amid ambiguity, and build alignment across research, engineering, product, and leadership stakeholders.
Excellent communication skills with the ability to distill complex ideas into clear narratives for executives, cross-functional partners, and external audiences.