BoK: Introducing Bag-of-Keywords Loss for Interpretable Dialogue Response Generation

📅 2025-01-17
🏛️ SIGDIAL Conferences
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Standard language modeling loss alone often fails to capture the core semantic content of dialogue responses. To address this, the authors propose a Bag-of-Keywords (BoK) auxiliary loss that predicts only the keywords of the next utterance, rather than all of its tokens as in the conventional Bag-of-Words (BoW) loss. The BoK loss is jointly optimized with the primary language modeling loss via a weighted sum and is architecture-agnostic, applicable to both encoder-decoder (e.g., T5) and decoder-only (e.g., DialoGPT) models. Experiments on DailyDialog and Persona-Chat show improved response quality over the backbone models while enabling post-hoc interpretability. The BoK-based language model (BoK-LM) also serves as a reference-free evaluation metric, achieving correlations with human judgments comparable to state-of-the-art metrics.

📝 Abstract
The standard language modeling (LM) loss by itself has been shown to be inadequate for effective dialogue modeling. As a result, various training approaches, such as auxiliary loss functions and leveraging human feedback, are being adopted to enrich open-domain dialogue systems. One such auxiliary loss function is Bag-of-Words (BoW) loss, defined as the cross-entropy loss for predicting all the words/tokens of the next utterance. In this work, we propose a novel auxiliary loss named Bag-of-Keywords (BoK) loss to capture the central thought of the response through keyword prediction and leverage it to enhance the generation of meaningful and interpretable responses in open-domain dialogue systems. BoK loss upgrades the BoW loss by predicting only the keywords or critical words/tokens of the next utterance, intending to estimate the core idea rather than the entire response. We incorporate BoK loss in both encoder-decoder (T5) and decoder-only (DialoGPT) architectures and train the models to minimize the weighted sum of BoK and LM (BoK-LM) loss. We perform our experiments on two popular open-domain dialogue datasets, DailyDialog and Persona-Chat. We show that the inclusion of BoK loss improves the dialogue generation of backbone models while also enabling post-hoc interpretability. We also study the effectiveness of BoK-LM loss as a reference-free metric and observe comparable performance to the state-of-the-art metrics on various dialogue evaluation datasets.
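The abstract describes the training objective as a weighted sum of the LM loss and a BoK loss, where the BoK term is a cross-entropy over only the keyword tokens of the next utterance. The following is a minimal numpy sketch of that combination; the function names, the weight `lam`, and the assumption that keyword token ids are pre-extracted are illustrative, not the paper's actual implementation.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a vocabulary logit vector
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def bok_loss(keyword_logits, keyword_ids):
    """Cross-entropy over the bag of keyword token ids (hypothetical sketch).

    keyword_logits: logits over the vocabulary from a pooled context vector.
    keyword_ids: ids of keywords extracted from the next utterance (assumed given).
    """
    probs = softmax(keyword_logits)
    return -np.mean(np.log(probs[keyword_ids] + 1e-12))

def bok_lm_loss(lm_loss, keyword_logits, keyword_ids, lam=0.5):
    # weighted sum of the primary LM loss and the auxiliary BoK loss;
    # lam is an assumed weighting hyperparameter
    return lm_loss + lam * bok_loss(keyword_logits, keyword_ids)
```

Since the BoK term is always non-negative, the combined loss upper-bounds the LM loss for any positive weight, and the predicted keyword distribution itself is what the abstract credits with post-hoc interpretability.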
Problem

Research questions and friction points this paper is trying to address.

Dialogue Systems
Response Quality
Key Point Understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

BoK Loss
Dialogue Quality Improvement
Interpretability Enhancement