🤖 AI Summary
Code-switching (CS) in end-to-end automatic speech recognition (ASR) remains underexplored, with fragmented efforts lacking systematic synthesis, standardized evaluation, and balanced multilingual resources. Method: We conduct the first structured literature review of CS-ASR research (2018–2023), analyzing papers from ACL, INTERSPEECH, and ICASSP to categorize language pairs, open datasets, model architectures (e.g., Conformer, fine-tuned Whisper), and evaluation protocols, including language-specific and mixed-segment WER/CER. Contribution/Results: We identify three core challenges: severe imbalance in multilingual resource availability, limited cross-lingual modeling capacity, and the absence of unified evaluation benchmarks. To address these, we propose a reproducible CS-ASR research taxonomy and a curated public resource inventory, offering theoretical foundations and practical guidance for tackling data scarcity, semantic consistency modeling, and fine-grained evaluation, the key bottlenecks in robust CS-ASR development.
📝 Abstract
Motivated by a growing research interest in automatic speech recognition (ASR), and by the growing body of work on languages in which code-switching (CS) often occurs, we present a systematic literature review of code-switching in end-to-end ASR models. We collect and manually annotate papers published in peer-reviewed venues. We document the languages considered, the datasets, metrics, model choices, and reported performance, and discuss the challenges of end-to-end ASR for code-switching. Our analysis thus provides insight into current research efforts and available resources, as well as the opportunities and gaps that can guide future research.