PyTOD: Programmable Task-Oriented Dialogue with Execution Feedback

📅 2025-08-21
🤖 AI Summary
This work addresses inaccurate dialogue state tracking (DST) in task-oriented dialogue (TOD). It proposes an execution-aware state tracking framework in which a large language model generates executable Python code to model the user's goal as the dialogue progresses. Instead of grammar rules, constrained decoding relies on a language model to keep the generated calls consistent with the API schemata. Policy and execution feedback is then used to detect and correct state tracking errors. On the SGD benchmark, the approach reports state-of-the-art state tracking performance, surpassing strong baselines in accuracy and in the robustness of user goal estimation over the course of the dialogue. These results support representing dialogue state as code for DST.

📝 Abstract
Programmable task-oriented dialogue (TOD) agents enable language models to follow structured dialogue policies, but their effectiveness hinges on accurate state tracking. We present PyTOD, an agent that generates executable code to track dialogue state and uses policy and execution feedback for efficient error correction. To this end, PyTOD employs a simple constrained decoding approach, using a language model instead of grammar rules to follow API schemata. This leads to state-of-the-art state tracking performance on the challenging SGD benchmark. Our experiments show that PyTOD surpasses strong baselines in both accuracy and robust user goal estimation as the dialogue progresses, demonstrating the effectiveness of execution-aware state tracking.
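The idea of tracking state by executing generated code can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `SCHEMA` dictionary, the `Call` class, and the `start` and `run_turn` helpers are hypothetical names, and the "generated" lines are hard-coded strings standing in for language model output. Each line is executed against a schema-aware object, and any execution error is returned as a feedback message that a corrected generation could respond to.

```python
from dataclasses import dataclass, field

# Hypothetical API schema: intent name -> allowed slot names (not taken from SGD).
SCHEMA = {"find_restaurant": {"city", "cuisine", "price_range"}}


@dataclass
class Call:
    """One API call under construction; its keyword arguments are the tracked slots."""
    intent: str
    slots: dict = field(default_factory=dict)

    def update(self, **kwargs):
        # Reject slot names the schema does not define, before changing any state.
        unknown = set(kwargs) - SCHEMA[self.intent]
        if unknown:
            raise ValueError(f"unknown slots for {self.intent}: {sorted(unknown)}")
        self.slots.update(kwargs)


def start(intent):
    """Begin tracking a new intent; fails if the schema does not define it."""
    if intent not in SCHEMA:
        raise ValueError(f"unknown intent: {intent}")
    return Call(intent)


def run_turn(code, env):
    """Execute one generated line; return None on success or an error message as feedback.

    Emptying __builtins__ only keeps the toy example small. It is NOT a security sandbox.
    """
    try:
        exec(code, {"__builtins__": {}, "start": start}, env)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


# Hard-coded stand-ins for model-generated code, one line per dialogue turn.
env = {}
turns = [
    'call = start("find_restaurant")',
    'call.update(city="Paris", food="thai")',     # wrong slot name, produces feedback
    'call.update(city="Paris", cuisine="thai")',  # corrected line after the feedback
]
feedback = [run_turn(code, env) for code in turns]
state = env["call"].slots
```

In this sketch the second line fails without modifying the state, so `state` reflects only the first and third lines, and `feedback` holds the error message that a real agent could pass back to the model when asking for a correction.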

Problem

Research questions and friction points this paper is trying to address.

Improving dialogue state tracking accuracy in programmable agents
Generating executable code for efficient error correction
Enhancing robust user goal estimation during dialogues

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates executable code for state tracking
Uses policy and execution feedback for correction
Employs constrained decoding with language model
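One way to picture constrained decoding without grammar rules is to let a scoring model choose among the names a schema allows, so the output can never leave that set. The sketch below is an assumption-laden illustration and not PyTOD's actual decoding procedure: `toy_score` is a string-similarity stand-in for a language model score, and `constrained_choice` and the slot names are hypothetical.

```python
from difflib import SequenceMatcher


def toy_score(context: str, candidate: str) -> float:
    """Stand-in for a language-model score; here it is plain string similarity."""
    return SequenceMatcher(None, context.lower(), candidate.lower()).ratio()


def constrained_choice(context, candidates, score=toy_score):
    """Return the highest-scoring candidate; the result is always one of `candidates`."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    return max(candidates, key=lambda c: score(context, c))


# Hypothetical slot names taken from an API schema.
slot_names = ["city", "cuisine", "price_range"]
chosen = constrained_choice("food type", slot_names)
```

Swapping `toy_score` for a real model's likelihood of each candidate given the dialogue context would keep the same guarantee: whatever the model prefers, the selected name is always one the schema defines.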