Modeling the One-to-Many Property in Open-Domain Dialogue with LLMs

📅 2025-06-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Open-domain dialogue exhibits a one-to-many (O2M) response property—where multiple valid, high-quality responses exist for a single input—yet mainstream LLM-based dialogue systems fail to explicitly model this characteristic. To address this, we propose a two-stage O2M dialogue framework: (1) generating diverse, high-quality candidate responses, and (2) selecting the optimal response based on human preference signals. This work is the first to decouple O2M modeling into multi-response generation and preference-based selection. We introduce o2mDial, the first dialogue dataset explicitly designed for O2M modeling. Additionally, we develop tailored in-context learning and instruction-tuning strategies, along with a novel Multi-Response Generation (MRG) evaluation metric. Experiments demonstrate that our approach significantly improves response diversity and contextual consistency—especially for smaller models—achieving up to a 90% gain in response quality and approaching the performance of much larger models.

📝 Abstract
Open-domain Dialogue (OD) exhibits a one-to-many (o2m) property, whereby multiple appropriate responses exist for a single dialogue context. Despite prior research showing that modeling this property boosts response diversity, most modern LLM-based dialogue agents do not explicitly do so. In this work, we model the o2m property of OD in LLMs by decomposing OD generation into two key tasks: Multi-Response Generation (MRG) and Preference-based Selection (PS), which entail generating a set of n semantically and lexically diverse high-quality responses for a given dialogue context, followed by selecting a single response based on human preference, respectively. To facilitate MRG and PS, we introduce o2mDial, a dialogue corpus explicitly designed to capture the o2m property by featuring multiple plausible responses for each context. Leveraging o2mDial, we propose new in-context learning and instruction-tuning strategies, as well as novel evaluation metrics for MRG, alongside a model-based approach for PS. Empirical results demonstrate that applying the proposed two-stage framework to smaller LLMs for OD generation enhances overall response diversity while maintaining contextual coherence, improving response quality by up to 90%, bringing them closer to the performance of larger models.
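The abstract's first stage, Multi-Response Generation, requires the n candidates to be lexically as well as semantically diverse. The paper achieves this with in-context learning and instruction tuning; purely as a toy illustration of the lexical-diversity constraint, a greedy word-level Jaccard filter (both function names here are hypothetical, not from the paper) might look like:

```python
def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two responses."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def diverse_subset(candidates: list[str], max_sim: float = 0.5) -> list[str]:
    """Greedily keep each candidate only if it is sufficiently
    dissimilar (Jaccard < max_sim) to every candidate kept so far."""
    kept: list[str] = []
    for c in candidates:
        if all(jaccard(c, k) < max_sim for k in kept):
            kept.append(c)
    return kept
```

For example, near-paraphrases such as "I love hiking in the mountains" and "I love hiking in the hills" would collapse to one entry, while a topically different response survives the filter.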
Problem

Research questions and friction points this paper is trying to address.

Modeling one-to-many property in open-domain dialogue with LLMs
Generating diverse high-quality responses for dialogue contexts
Selecting preferred responses based on human preferences
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decompose dialogue into MRG and PS tasks
Introduce o2mDial corpus for diverse responses
Propose in-context learning and evaluation metrics
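The second stage, Preference-based Selection, picks one response from the candidate set using a model of human preference. The sketch below only shows the selection interface; the scoring callable stands in for the paper's model-based preference scorer, and the function name is an assumption for illustration:

```python
from typing import Callable


def select_response(context: str, candidates: list[str],
                    score: Callable[[str, str], float]) -> str:
    """Preference-based Selection: return the candidate with the
    highest preference score for the given dialogue context.
    `score(context, response)` is a placeholder for a learned
    preference/reward model."""
    return max(candidates, key=lambda r: score(context, r))
```

A trivial scorer (e.g. word overlap with the context) is enough to exercise the interface; in the paper's framework the scorer would be a preference model trained on human preference signals.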
Jing Yang Lee
School of Electrical and Electronic Engineering, Nanyang Technological University
Kong-Aik Lee
Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University
Woon-Seng Gan
Professor of Audio Engineering and Director of Smart Nation Lab @ Nanyang Technological University
Active Noise Control · Machine & Deep Learning · Spatial Audio · Perceptual Evaluation