Intent-Aware Dialogue Generation and Multi-Task Contrastive Learning for Multi-Turn Intent Classification

📅 2024-11-21
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
Problem: High-quality, multilingual, domain-specific dialogue data for multi-turn intent classification remains scarce. Method: The paper proposes the Chain-of-Intent generation mechanism, an LLM–HMM collaborative framework that integrates self-play and intent-transition modeling to improve dialogue coherence and domain fidelity, together with MINT-CL, a multi-task contrastive learning framework for classification that incorporates domain-knowledge distillation and requires minimal labeled data. Contribution/Results: The authors release MINT-E, the first open-source multilingual e-commerce multi-turn intent dialogue corpus. Experiments show significant gains in multilingual intent classification accuracy, a 3.2× increase in data generation efficiency, and better dialogue naturalness and intent consistency than baselines.

📝 Abstract
Generating large-scale, domain-specific, multilingual multi-turn dialogue datasets remains a significant hurdle for training effective Multi-Turn Intent Classification models in chatbot systems. In this paper, we introduce Chain-of-Intent, a novel mechanism that combines Hidden Markov Models with Large Language Models (LLMs) to generate contextually aware, intent-driven conversations through self-play. By extracting domain-specific knowledge from e-commerce chat logs, we estimate conversation turns and intent transitions, which guide the generation of coherent dialogues. Leveraging LLMs to enhance emission probabilities, our approach produces natural and contextually consistent questions and answers. We also propose MINT-CL, a framework for multi-turn intent classification using multi-task contrastive learning, improving classification accuracy without the need for extensive annotated data. Evaluations show that our methods outperform baselines in dialogue quality and intent classification accuracy, especially in multilingual settings, while significantly reducing data generation efforts. Furthermore, we release MINT-E, a multilingual, intent-aware multi-turn e-commerce dialogue corpus to support future research in this area.
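The abstract's core mechanism — estimate conversation length and intent transitions from chat logs, then sample an intent chain that guides an LLM's self-play generation — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the intent labels, start distribution, and transition matrix below are hypothetical placeholders for statistics the paper estimates from real e-commerce chat logs.

```python
import random

# Hypothetical intent set and HMM parameters; the paper estimates these
# from e-commerce chat logs rather than hard-coding them.
INTENTS = ["track_order", "return_item", "payment_issue", "product_info"]
START_PROBS = {"track_order": 0.4, "return_item": 0.2,
               "payment_issue": 0.2, "product_info": 0.2}
TRANSITIONS = {
    "track_order":   {"track_order": 0.1, "return_item": 0.4,
                      "payment_issue": 0.3, "product_info": 0.2},
    "return_item":   {"track_order": 0.2, "return_item": 0.1,
                      "payment_issue": 0.5, "product_info": 0.2},
    "payment_issue": {"track_order": 0.3, "return_item": 0.2,
                      "payment_issue": 0.1, "product_info": 0.4},
    "product_info":  {"track_order": 0.3, "return_item": 0.3,
                      "payment_issue": 0.2, "product_info": 0.2},
}

def sample_intent_chain(num_turns, rng=random):
    """Sample a chain of turn-level intents: the first turn comes from the
    start distribution, each later turn from the transition distribution
    conditioned on the previous intent (a standard first-order HMM walk)."""
    chain = []
    dist = START_PROBS
    for _ in range(num_turns):
        intents, weights = zip(*dist.items())
        intent = rng.choices(intents, weights=weights, k=1)[0]
        chain.append(intent)
        dist = TRANSITIONS[intent]
    return chain

# Each sampled intent would then condition an LLM prompt that plays both
# user and agent roles (self-play), producing the actual question/answer
# text — the "LLM-enhanced emission" step described in the abstract.
chain = sample_intent_chain(3)
```

In the paper's framing, the HMM supplies the latent structure (turn count and intent transitions) while the LLM replaces the HMM's emission step, generating fluent utterances conditioned on each sampled intent.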
Problem

Research questions and friction points this paper is trying to address.

Generating large-scale multilingual dialogue datasets for intent classification
Modeling intent transitions and context-aware dialogue generation
Reducing annotation dependence for multi-turn intent classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines HMMs and LLMs for dialogue generation
Uses contrastive learning for intent classification
Generates multilingual intent-driven conversation datasets
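The contrastive-learning component listed above can be illustrated with a generic supervised contrastive loss of the kind MINT-CL's multi-task objective would combine with a classification loss. This is an assumed, stand-in formulation (in the style of supervised contrastive learning), not the paper's exact loss: it pulls together embeddings of utterances that share an intent label and pushes apart the rest.

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over a batch of utterance
    embeddings: for each anchor, treat same-label samples as positives
    and all other samples as negatives.  A stand-in illustration; the
    paper's multi-task objective may differ in detail."""
    labels = np.asarray(labels)
    # L2-normalise so the dot product is cosine similarity.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    # Exclude self-similarity, then take a log-softmax over the rest.
    sim = np.where(self_mask, -np.inf, sim)
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    total, count = 0.0, 0
    for i in range(n):
        positives = (labels == labels[i]) & ~self_mask[i]
        if positives.any():
            total += -log_prob[i, positives].mean()
            count += 1
    return total / max(count, 1)
```

In a multi-task setup, a loss like this would be added to the standard cross-entropy classification loss, so the encoder learns intent-discriminative representations even when labeled data is limited.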