DuIVRS-2: An LLM-based Interactive Voice Response System for Large-scale POI Attribute Acquisition

📅 2026-05-18
🤖 AI Summary
This work addresses the challenges of error propagation and high maintenance costs in traditional modular voice response systems for large-scale point-of-interest (POI) attribute collection. To overcome these limitations, the authors propose an end-to-end dialogue system powered by a large language model. The approach leverages finite state machine (FSM)-guided data augmentation to mitigate long-tailed data distributions, integrates chain-of-thought (CoT) reasoning with selective generation to suppress hallucinations, and introduces a dual-evaluator collaborative iterative learning mechanism for continuous policy optimization with minimal human intervention. Deployed in production, the system handles approximately 400,000 calls per day, achieving a task success rate of 83.9%, a 4-percentage-point improvement over the previous system, with an average response latency of only 130 milliseconds.
📝 Abstract
Accurate Point of Interest (POI) attribute acquisition is essential for location-based services, yet traditional modular Interactive Voice Response (IVR) systems suffer from error accumulation and high maintenance overhead. We present DuIVRS-2, a large language model (LLM)-based end-to-end framework designed for large-scale POI attribute acquisition at Baidu Maps. To address the long-tail distribution of real-world interactions, our methodology first employs a finite state machine (FSM)-guided data augmentation strategy to synthesize a balanced and diverse training dataset. We then streamline dialogue management via a selective generation scheme combined with a Chain-of-Thought (CoT) mechanism, which ensures output stability and effectively eliminates hallucinations in industrial settings. To facilitate continuous policy refinement with minimal manual effort, we design a cooperative iterative learning framework that leverages a dual-evaluator voting system. Deployed in production for two months, DuIVRS-2 processed 0.4 million calls daily and achieved an 83.9% Task Success Rate (TSR), outperforming its predecessor by 4 percentage points while maintaining a low reaction time of 130 ms. This work provides a production-proven reference for developing robust, cost-effective LLM agents for large-scale industrial dialogue applications.
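To make the FSM-guided augmentation idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the state machine, so the states, intents, and the inverse-frequency sampling rule below are all assumptions chosen for illustration. The sketch walks a small hypothetical dialogue FSM and favors transitions that have been sampled less often, which is one simple way to push synthetic dialogues toward long-tail branches.

```python
import random
from collections import Counter

# Hypothetical dialogue FSM: state -> {user_intent: next_state}.
# State and intent names are illustrative assumptions, not taken from the paper.
FSM = {
    "GREET": {"confirm_identity": "ASK_OPEN", "wrong_number": "END", "busy": "END"},
    "ASK_OPEN": {"open": "ASK_HOURS", "closed": "END", "unclear": "CLARIFY"},
    "CLARIFY": {"open": "ASK_HOURS", "closed": "END", "unclear": "END"},
    "ASK_HOURS": {"gives_hours": "END", "refuses": "END"},
}


def sample_path(rng, counts, max_turns=8):
    """Walk the FSM from GREET, preferring transitions sampled less often so far."""
    state, path = "GREET", []
    while state != "END" and len(path) < max_turns:
        intents = list(FSM[state])
        # Inverse-frequency weights push sampling toward rare (long-tail) branches.
        weights = [1.0 / (1 + counts[(state, intent)]) for intent in intents]
        intent = rng.choices(intents, weights=weights, k=1)[0]
        counts[(state, intent)] += 1
        path.append((state, intent))
        state = FSM[state][intent]
    return path


def augment(n_dialogues, seed=0):
    """Return n_dialogues synthetic (state, intent) paths plus transition counts."""
    rng = random.Random(seed)
    counts = Counter()
    paths = [sample_path(rng, counts) for _ in range(n_dialogues)]
    return paths, counts


if __name__ == "__main__":
    paths, counts = augment(200)
    for transition, n in sorted(counts.items()):
        print(transition, n)
```

In a real pipeline each sampled path would still need to be rendered into utterances (for example by templates or a generator model); the sketch only covers choosing which dialogue skeletons to synthesize.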
Problem

Research questions and friction points this paper is trying to address.

Interactive Voice Response
POI attribute acquisition
error accumulation
maintenance overhead
large-scale
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based IVR
FSM-guided data augmentation
Chain-of-Thought reasoning
cooperative iterative learning
hallucination mitigation
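
The cooperative iterative learning item above rests on a dual-evaluator vote, which the abstract names but does not detail. The sketch below shows one plausible reading under stated assumptions: two independent evaluators judge each candidate sample, agreement decides automatically, and disagreement is routed to a human. The evaluator functions, the sample fields (`action`, `reply`, `judge_score`), and the threshold are all hypothetical.

```python
# Assumed action vocabulary for the toy rule-based evaluator (hypothetical).
ALLOWED_ACTIONS = {"ask_open", "ask_hours", "clarify", "close"}


def rule_evaluator(sample):
    """Toy rule-based check: known action and a non-empty reply."""
    return sample["action"] in ALLOWED_ACTIONS and bool(sample["reply"].strip())


def score_evaluator(sample, threshold=0.5):
    """Toy stand-in for a model-based judge: accept if its score clears a threshold."""
    return sample["judge_score"] >= threshold


def triage(samples, eval_a, eval_b):
    """Split samples by the two evaluators' votes.

    Both accept -> accepted (usable for the next training round).
    Both reject -> rejected.
    Disagreement -> needs_review (the only case sent to a human).
    """
    accepted, rejected, needs_review = [], [], []
    for sample in samples:
        vote_a, vote_b = eval_a(sample), eval_b(sample)
        if vote_a and vote_b:
            accepted.append(sample)
        elif not vote_a and not vote_b:
            rejected.append(sample)
        else:
            needs_review.append(sample)
    return accepted, rejected, needs_review
```

Routing only disagreements to a reviewer is what would keep manual effort low in this reading; how DuIVRS-2 actually resolves split votes is not stated in the source.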