Automatic Speech Recognition (ASR) for African Low-Resource Languages: A Systematic Literature Review

📅 2025-10-01
🤖 AI Summary
Over 2,000 low-resource African languages remain severely underrepresented in automatic speech recognition (ASR), impeding digital inclusion. Method: Following the PRISMA 2020 guidelines, we systematically review 2020–2025 literature, synthesizing 74 datasets spanning 111 languages and evaluating model architectures, training paradigms (e.g., self-supervised and transfer learning), and performance metrics (WER, CER). Contribution/Results: We identify three core bottlenecks: acute data scarcity, poor annotation quality, and absence of standardized benchmarks; moreover, fewer than 15% of studies provide reproducible artifacts or explicit licensing. We propose a novel, community-driven paradigm emphasizing lightweight modeling, ethically grounded data curation, and participatory development. Finally, we outline actionable pathways toward sustainable ASR advancement for African languages. This work delivers the first comprehensive benchmark analysis and practical roadmap for African-language ASR research.

📝 Abstract
ASR has achieved remarkable global progress, yet African low-resource languages remain severely underrepresented, creating barriers to digital inclusion across a continent with more than 2,000 languages. This systematic literature review (SLR) surveys research on ASR for African languages, focusing on datasets, models and training methods, evaluation techniques, and challenges, and recommends future directions. Following the PRISMA 2020 procedures, we search DBLP, the ACM Digital Library, Google Scholar, Semantic Scholar, and arXiv for studies published between January 2020 and July 2025. We include studies related to ASR datasets, models, or metrics for African languages, while excluding non-African, duplicate, and low-quality studies (quality score < 3/5). From 2,062 records, we screen 71 studies and catalogue a total of 74 datasets across 111 languages, encompassing approximately 11,206 hours of speech. Fewer than 15% of studies provide reproducible materials, and dataset licensing is often unclear. Self-supervised and transfer learning techniques are promising, but are hindered by limited pre-training data, inadequate dialect coverage, and scarce resources. Most studies report Word Error Rate (WER), with minimal use of linguistically informed metrics such as Character Error Rate (CER) or Diacritic Error Rate (DER), limiting applicability to tonal and morphologically rich languages. The existing evidence on ASR systems is inconsistent, hindered by issues such as dataset availability, poor annotations, licensing uncertainties, and limited benchmarking. Nevertheless, the rise of community-driven initiatives and methodological advances indicates a pathway for improvement. Sustainable development in this area will require stakeholder partnerships, the creation of ethically grounded, well-balanced datasets, lightweight modelling techniques, and active benchmarking.
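For readers unfamiliar with the metrics named above: WER and CER are both normalized edit distances, computed over words and characters respectively. The sketch below is a minimal, illustrative implementation (function names are my own, not from any paper in the review); production systems typically use established toolkits such as jiwer or SCTK instead.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (single-row DP)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            # prev holds d[i-1][j-1]; dp[j] still holds d[i-1][j]
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (r != h))  # substitution/match
    return dp[len(hyp)]

def wer(reference, hypothesis):
    """Word Error Rate: word-level edits / reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    """Character Error Rate: character-level edits / reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

CER (and, by extension, a diacritic-aware variant like DER) is often more informative than WER for agglutinative or tonal languages, where a single wrong affix or diacritic should not count as an entire word error.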
Problem

Research questions and friction points this paper is trying to address.

Addressing ASR underrepresentation for African low-resource languages
Identifying challenges in datasets, models, and evaluation techniques
Recommending sustainable development strategies for ASR improvement
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic review of datasets, models, and training methods
Self-supervised and transfer learning techniques for low-resource languages
Lightweight modeling techniques with active benchmarking approaches