Improving Dialogue State Tracking through Combinatorial Search for In-Context Examples

📅 2025-05-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing methods for constructing retriever training data in dialogue state tracking (DST) overlook the synergy among in-context examples, do not sufficiently account for the linguistic characteristics of the query, and do not score examples directly by their effect on DST performance. Method: The paper proposes CombiSearch, which scores in-context examples by their combinatorial impact on DST performance and uses those scores to train the retriever. Results: On MultiWOZ, retrievers trained with CombiSearch surpass state-of-the-art models with a 20× gain in data efficiency and generalize well to the SGD dataset. When no retrieval errors are assumed, the upper bound on DST performance rises by 12% absolute over traditional approaches, which indicates that existing methods train retrievers on suboptimal data.

📝 Abstract

In dialogue state tracking (DST), in-context learning comprises a retriever that selects labeled dialogues as in-context examples and a DST model that uses these examples to infer the dialogue state of the query dialogue. Existing methods for constructing training data for retrievers suffer from three key limitations: (1) the synergistic effect of examples is not considered, (2) the linguistic characteristics of the query are not sufficiently factored in, and (3) scoring is not directly optimized for DST performance. Consequently, the retriever can fail to retrieve examples that would substantially improve DST performance. To address these issues, we present CombiSearch, a method that scores effective in-context examples based on their combinatorial impact on DST performance. Our evaluation on MultiWOZ shows that retrievers trained with CombiSearch surpass state-of-the-art models, achieving a 20x gain in data efficiency and generalizing well to the SGD dataset. Moreover, CombiSearch attains a 12% absolute improvement in the upper bound DST performance over traditional approaches when no retrieval errors are assumed. This significantly increases the headroom for practical DST performance while demonstrating that existing methods rely on suboptimal data for retriever training.

Problem

Research questions and friction points this paper is trying to address.

Existing retriever training ignores the synergistic effects of examples

Current methods do not sufficiently account for the linguistic characteristics of the query

Retriever scoring is not directly optimized for DST performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

CombiSearch scores examples by combinatorial impact

Optimizes retriever training for DST performance

Enhances data efficiency and generalization
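
To make "scoring examples by combinatorial impact" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the CombiSearch algorithm itself, whose details are not given in the abstract. The function `combinatorial_scores`, the exhaustive enumeration of size-`k` subsets, the mean-over-combinations aggregation, and the toy evaluator `toy_evaluate` (which stands in for running a DST model and measuring its accuracy) are all hypothetical choices made for this example.

```python
from itertools import combinations
from statistics import mean
from typing import Callable, Dict, Sequence, Tuple


def combinatorial_scores(
    candidates: Sequence[str],
    evaluate: Callable[[Tuple[str, ...]], float],
    k: int = 2,
) -> Dict[str, float]:
    """Score each candidate by the mean performance of the k-subsets containing it.

    Assumption: `evaluate` returns a DST quality measure for a given
    combination of in-context examples (higher is better).
    """
    if not 1 <= k <= len(candidates):
        raise ValueError("k must be between 1 and the number of candidates")
    per_candidate = {c: [] for c in candidates}
    for combo in combinations(candidates, k):
        perf = evaluate(combo)  # performance of the whole combination
        for c in combo:
            per_candidate[c].append(perf)
    return {c: mean(values) for c, values in per_candidate.items()}


# Toy stand-in for a DST model (hypothetical data): each example "teaches"
# some slots, and performance is the fraction of gold slots the combination covers.
TOY_SLOTS = {
    "ex_a": {"hotel-area"},
    "ex_b": {"hotel-price"},
    "ex_c": {"train-day"},
}
GOLD = {"hotel-area", "hotel-price"}


def toy_evaluate(combo: Tuple[str, ...]) -> float:
    covered = set().union(*(TOY_SLOTS[e] for e in combo))
    return len(covered & GOLD) / len(GOLD)


scores = combinatorial_scores(list(TOY_SLOTS), toy_evaluate, k=2)
```

In this toy setup, `ex_a` and `ex_b` each cover only half of the gold slots alone but cover all of them together, so both receive a higher score than `ex_c`. That is the kind of synergy a per-example score computed in isolation cannot capture. Exhaustive enumeration grows combinatorially with the candidate pool, so a practical system would need some form of search or sampling over combinations.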
