🤖 AI Summary
Code-switching (CS) in end-to-end automatic speech recognition (ASR) remains underexplored, with fragmented efforts lacking systematic synthesis, standardized evaluation, and balanced multilingual resources. Method: We conduct the first structured literature review of CS-ASR research (2018–2023), analyzing papers from ACL, INTERSPEECH, and ICASSP to categorize language pairs, open datasets, model architectures (e.g., Conformer, fine-tuned Whisper), and evaluation protocols, including language-specific and mixed-segment WER/CER. Contribution/Results: We identify three core challenges: severe imbalance in multilingual resource availability, limited cross-lingual modeling capacity, and the absence of unified evaluation benchmarks. To address these, we propose a reproducible CS-ASR research taxonomy and a curated public resource inventory, offering theoretical foundations and practical guidance for tackling data scarcity, semantic consistency modeling, and fine-grained evaluation, the key bottlenecks in robust CS-ASR development.
📝 Abstract
Motivated by a growing research interest in automatic speech recognition (ASR), and by the growing body of work on languages in which code-switching (CS) often occurs, we present a systematic literature review of code-switching in end-to-end ASR models. We collect and manually annotate papers published in peer-reviewed venues. We document the languages considered, the datasets, metrics, model choices, and reported performance, and discuss the challenges of end-to-end ASR for code-switching. Our analysis thus provides insight into current research efforts and available resources, as well as the opportunities and gaps that can guide future research.