Summary on The Multilingual Conversational Speech Language Model Challenge: Datasets, Tasks, Baselines, and Methods

📅 2025-09-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of building multilingual conversational speech LLMs (SLLMs). Method: The authors organize a benchmark competition dedicated to real-world multilingual conversational speech-language modeling. They release a multilingual conversational speech dataset spanning 10+ languages and approximately 1,604 hours of real-world speech, define two tasks (multilingual conversational speech recognition, and multilingual conversational speech diarization and recognition), and provide baseline systems for participants. Contribution/Results: The competition attracted 78 teams from 13 countries, yielding 489 valid leaderboard results and 14 technical reports. It establishes a systematic benchmark for multilingual conversational SLLMs and distills insights from participant submissions on how to build such systems.

📝 Abstract
This paper summarizes the Interspeech 2025 Multilingual Conversational Speech Language Model (MLC-SLM) challenge, which aims to advance the exploration of building effective multilingual conversational speech LLMs (SLLMs). We provide a detailed description of the task settings for the MLC-SLM challenge, the released real-world multilingual conversational speech dataset totaling approximately 1,604 hours, and the baseline systems for participants. The MLC-SLM challenge attracted 78 teams from 13 countries, with 489 valid leaderboard results and 14 technical reports across the two tasks. We distill valuable insights on building multilingual conversational SLLMs from the participants' submissions, aiming to contribute to the advancement of the community.
Problem

Research questions and friction points this paper is trying to address.

Advancing multilingual conversational speech language models
Providing task settings and a multilingual conversational speech dataset
Establishing baseline systems for model evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multilingual conversational speech dataset
Baseline systems for participants
Insights from 489 leaderboard results
Bingshen Mu
Northwestern Polytechnical University
Speech Recognition, Speech Understanding
Pengcheng Guo
Northwestern Polytechnical University
Speech Recognition, Machine Learning, Deep Learning
Zhaokai Sun
Audio, Speech and Language Processing Group (ASLP@NPU), School of Computer Science, Northwestern Polytechnical University, Xi’an, China
Shuai Wang
School of Intelligence Science and Technology, Nanjing University
Hexin Liu
Nanyang Technological University
Speech recognition, language identification
Mingchen Shao
Audio, Speech and Language Processing Group (ASLP@NPU), School of Computer Science, Northwestern Polytechnical University, Xi’an, China
Lei Xie
Audio, Speech and Language Processing Group (ASLP@NPU), School of Computer Science, Northwestern Polytechnical University, Xi’an, China
Eng Siong Chng
College of Computing and Data Science, Nanyang Technological University, Singapore
Longshuai Xiao
Huawei Technologies, China
Qiangze Feng
Nexdata Technology Inc., USA
Daliang Wang
Nexdata Technology Inc., USA