MAC-SLU: Multi-Intent Automotive Cabin Spoken Language Understanding Benchmark

📅 2025-12-01
🤖 AI Summary
Existing SLU datasets suffer from limited scenario diversity, simplistic intent structures, and the absence of a unified, large-model-oriented evaluation benchmark. To address these limitations, we introduce MAC-SLU, the first multi-intent spoken language understanding dataset specifically designed for in-vehicle cabin environments, covering realistic complex interactions, context dependency, and concurrent multi-intent utterances. MAC-SLU supports both end-to-end and pipeline-based SLU paradigms and establishes the first unified, fine-grained evaluation benchmark for large language models (LLMs) and large audio-language models (LALMs) in the in-vehicle domain. Experimental results show that supervised fine-tuning substantially outperforms in-context learning; moreover, end-to-end LALMs achieve performance comparable to pipeline methods while avoiding ASR error propagation. This work fills a gap in multi-intent in-vehicle SLU benchmarking and supports the rigorous evaluation and practical deployment of foundation models in real-world speech understanding tasks.

📝 Abstract
Spoken Language Understanding (SLU), which aims to extract user semantics to execute downstream tasks, is a crucial component of task-oriented dialog systems. Existing SLU datasets generally lack sufficient diversity and complexity, and there is an absence of a unified benchmark for the latest Large Language Models (LLMs) and Large Audio Language Models (LALMs). This work introduces MAC-SLU, a novel Multi-Intent Automotive Cabin Spoken Language Understanding Dataset, which increases the difficulty of the SLU task by incorporating authentic and complex multi-intent data. Based on MAC-SLU, we conducted a comprehensive benchmark of leading open-source LLMs and LALMs, covering methods like in-context learning, supervised fine-tuning (SFT), and end-to-end (E2E) and pipeline paradigms. Our experiments show that while LLMs and LALMs have the potential to complete SLU tasks through in-context learning, their performance still lags significantly behind SFT. Meanwhile, E2E LALMs demonstrate performance comparable to pipeline approaches and effectively avoid error propagation from speech recognition. Code (https://github.com/Gatsby-web/MAC_SLU) and datasets (huggingface.co/datasets/Gatsby1984/MAC_SLU) are released publicly.
Problem

Research questions and friction points this paper is trying to address.

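Existing SLU datasets lack the diversity and complexity of authentic multi-intent automotive cabin interactions
No unified benchmark exists for evaluating recent LLMs and LALMs on SLU, whether through in-context learning or fine-tuning
It is unclear whether end-to-end LALMs can match pipeline methods while avoiding speech recognition error propagation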
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces MAC-SLU dataset with complex multi-intent data
Benchmarks LLMs and LALMs using in-context learning and fine-tuning
Shows end-to-end LALMs avoid speech recognition error propagation
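
To make "multi-intent" concrete, the following is a minimal sketch of how one utterance carrying two intents might be represented and scored. The summary above does not give the MAC-SLU label schema or its metrics, so the `Frame` class, the intent and slot names, and both scoring functions are hypothetical illustrations, not the benchmark's actual format or evaluation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One intent plus its slots. Field names are illustrative, not the MAC-SLU schema."""
    intent: str
    slots: frozenset  # set of (slot_name, slot_value) pairs


def frame(intent, **slots):
    """Convenience constructor: frame("open_window", position="driver")."""
    return Frame(intent, frozenset(slots.items()))


def intent_f1(gold, pred):
    """Set-based F1 over intent labels for a single utterance (assumed metric)."""
    gold_intents = {f.intent for f in gold}
    pred_intents = {f.intent for f in pred}
    true_pos = len(gold_intents & pred_intents)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_intents)
    recall = true_pos / len(gold_intents)
    return 2 * precision * recall / (precision + recall)


def exact_match(gold, pred):
    """True only if every intent and every slot agrees, ignoring order."""
    return set(gold) == set(pred)


# Hypothetical utterance: "Open the driver's window and set the temperature to 22."
gold = [frame("open_window", position="driver"),
        frame("set_temperature", value="22")]
# A prediction that gets one intent right and hallucinates the other.
pred = [frame("open_window", position="driver"),
        frame("play_music")]

print(intent_f1(gold, pred))    # one of two intents matched on each side
print(exact_match(gold, pred))  # the full frame set differs
```

Comparing frames as sets makes the score independent of the order in which a model emits intents, which matters when several intents occur in one utterance.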
Authors
Yuezhang Peng
School of Computer Science, MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University
Chonghao Cai
School of Computer Science, MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University
Ziang Liu
School of Computer Science and Engineering, Central South University
Shuai Fan
AISpeech Co., Ltd
Sheng Jiang
Carnegie Mellon University
Storage Systems, Networked Systems, Distributed Computing
Hua Xu
AISpeech Co., Ltd
Yuxin Liu
School of Computer Science, MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University
Qiguang Chen
Harbin Institute of Technology
Chain-of-Thought, Reasoning, Multilingual LLM, Multi-modal LLM
Kele Xu
National University of Defense Technology
Yao Li
Shanghai Aviation Electric Co., Ltd
Sheng Wang
School of Computer Science, MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University; Shanghai Aviation Electric Co., Ltd
Libo Qin
School of Computer Science and Engineering, Central South University
Xie Chen
School of Computer Science, MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University