🤖 AI Summary
This study investigates the key factors governing in-context learning (ICL) efficacy of large language models (LLMs) for dialogue state tracking (DST). Addressing two critical components—exemplar selection and prompt design—we propose a sentence-embedding-based k-nearest-neighbor exemplar retrieval method to ensure both semantic relevance and diversity among demonstrations, and introduce a DST-optimized templated prompt structure. Evaluations on MultiWoZ 2.4 with OLMo-7B-Instruct, Mistral-7B-Instruct-v0.3, and Llama-3.2-3B-Instruct demonstrate that exemplar relevance, semantic diversity, and structured prompting significantly improve zero-shot DST performance. To our knowledge, this is the first systematic analysis identifying the decisive factors governing ICL effectiveness in DST. Our findings provide both theoretical insights and practical guidelines for developing efficient, reproducible zero-shot DST systems without fine-tuning.
📝 Abstract
This study explores the application of in-context learning (ICL) to the dialogue state tracking (DST) problem and investigates the factors that influence its effectiveness. We use a sentence embedding based k-nearest neighbour method to retrieve the suitable demonstrations for ICL. The selected demonstrations, along with the test samples, are structured within a template as input to the LLM. We then conduct a systematic study to analyse the impact of factors related to demonstration selection and prompt context on DST performance. This work is conducted using the MultiWoZ2.4 dataset and focuses primarily on the OLMo-7B-instruct, Mistral-7B-Instruct-v0.3, and Llama3.2-3B-Instruct models. Our findings provide several useful insights on in-context learning abilities of LLMs for dialogue state tracking.