Can LLMs Understand the Structure of Dialog? Exploring Multilingual Response Generation in Complex Scenarios

📅 2025-01-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) exhibit structural understanding bottlenecks in multilingual, multiparty dialogue, yet existing benchmarks lack realistic complexity and multilingual parallelism. Method: We introduce XMP, a high-quality, multilingual parallel dialogue benchmark derived from authentic multiparty podcasts, featuring at least three participants per sample, covering sociocultural and political topics, with fine-grained dialogue structure annotations and cross-lingual consistency evaluation. Contribution/Results: Our empirical analysis reveals critical deficiencies: LLMs achieve only 52% role-tracking accuracy and suffer a 37% drop in response coherence across languages, challenging the prevailing "multilingual complementarity" hypothesis. We propose a novel paradigm for modeling complex dialogue grounded in real-world podcast data, supported by controlled generation experiments and mechanistic analysis. The XMP dataset and evaluation framework are publicly released to advance standardized assessment of multilingual multiparty dialogue understanding.

📝 Abstract
Multilingual research has garnered increasing attention, especially in the domain of dialogue systems. The rapid advancement of large language models (LLMs) has fueled demand for high-performing multilingual models. However, two major challenges persist: the scarcity of high-quality multilingual datasets and the limited complexity of existing datasets in capturing realistic dialogue scenarios. To address these gaps, we introduce XMP, a high-quality parallel Multilingual dataset sourced from Multi-party Podcast dialogues. Each sample in the dataset features at least three participants discussing a wide range of topics, including society, culture, politics, and entertainment. Through extensive experiments, we uncover significant limitations in previously recognized multilingual capabilities of LLMs when applied to such complex dialogue scenarios. For instance, the widely accepted multilingual complementary ability of LLMs is notably impacted. Through further experiments, we examine the mechanisms of LLMs in multilingual environments from multiple perspectives, shedding new light on their performance in real-world, diverse conversational contexts.
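To make the dataset's "parallel multiparty" property concrete, the sketch below models a hypothetical XMP-style sample and two minimal structural checks: that every language version preserves the same speaker sequence, and that a sample has at least three participants. The field names and structure are assumptions for illustration, not the dataset's actual schema.

```python
# Hypothetical sketch of an XMP-style parallel multiparty sample.
# Schema (keys, speaker labels) is assumed, not taken from the paper.
# One dialogue, aligned turn-by-turn across languages: each turn keeps
# the same speaker label while the utterance text is translated.
sample = {
    "topic": "society",
    "turns": {
        "en": [("HOST", "Welcome back to the show."),
               ("GUEST_A", "Thanks for having me."),
               ("GUEST_B", "Great to be here.")],
        "zh": [("HOST", "欢迎回到节目。"),
               ("GUEST_A", "谢谢邀请。"),
               ("GUEST_B", "很高兴来到这里。")],
    },
}

def speakers_aligned(turns_by_lang):
    """Check that every language version has the same speaker sequence,
    a minimal structural-consistency test for parallel multiparty data."""
    sequences = [[spk for spk, _ in turns] for turns in turns_by_lang.values()]
    return all(seq == sequences[0] for seq in sequences[1:])

def participant_count(turns_by_lang):
    """Count distinct speakers (XMP samples have at least three)."""
    first = next(iter(turns_by_lang.values()))
    return len({spk for spk, _ in first})

print(speakers_aligned(sample["turns"]))   # True
print(participant_count(sample["turns"]))  # 3
```

A check like `speakers_aligned` is what makes parallel multiparty data harder to construct than parallel two-party data: translation must preserve not just meaning but the turn-and-speaker structure across every language version.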
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Multilingual Dialogue
Cross-lingual Understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multilingual Dataset
Complex Dialogue
Large Language Model Performance
Zhongtian Hu
School of Computer Science and Engineering, Northwestern Polytechnical University
Yiwen Cui
School of Computer Science and Engineering, Northwestern Polytechnical University
Ronghan Li
Xidian University
Natural language processing · Machine Reading Comprehension · Dialogue System
Meng Zhao
School of Artificial Intelligence and Big Data, Henan University of Technology
Lifang Wang
School of Computer Science and Engineering, Northwestern Polytechnical University