Enabling Automatic Disordered Speech Recognition: An Impaired Speech Dataset in the Akan Language

📅 2026-02-05
🤖 AI Summary
This study addresses the scarcity of disordered speech data in low-resource languages, which hinders the development of inclusive speech technologies. It presents a curated disordered speech dataset from native Akan speakers covering four classes of speech impairment: stammering, cerebral palsy, cleft palate, and stroke-induced speech disorder. Speech samples were collected in controlled, supervised environments through an image description task and are accompanied by transcriptions and metadata on speaker demographics, class of impairment, recording environment, and device. The resulting corpus comprises 50.01 hours of audio and is intended to support research on low-resource automatic disordered speech recognition and assistive speech technology.

📝 Abstract
The lack of impaired speech data hinders the development of inclusive speech technologies, particularly in low-resource languages such as Akan. To address this gap, this study presents a curated corpus of speech samples from native Akan speakers with speech impairments. The dataset comprises 50.01 hours of audio recordings spanning four classes of impaired speech: stammering, cerebral palsy, cleft palate, and stroke-induced speech disorder. Recordings were made in controlled, supervised environments where participants described pre-selected images in their own words. The resulting dataset is a collection of audio recordings, transcriptions, and associated metadata on speaker demographics, class of impairment, recording environment, and device. The dataset is intended to support research in low-resource automatic disordered speech recognition systems and assistive speech technology.
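As a rough illustration of how the components named in the abstract (audio, transcription, and metadata) might be represented per recording, here is a minimal Python sketch. The abstract does not specify a schema or file layout, so every field name, the `Recording` class, and the `hours_per_class` helper are hypothetical; only the four impairment class labels come from the abstract.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

# The four impairment classes named in the abstract.
IMPAIRMENT_CLASSES = (
    "stammering",
    "cerebral palsy",
    "cleft palate",
    "stroke-induced speech disorder",
)


@dataclass
class Recording:
    """One dataset entry. All field names are hypothetical assumptions;
    the abstract lists only the kinds of information that are stored."""

    audio_path: str
    transcription: str
    impairment_class: str
    duration_seconds: float
    speaker_age: Optional[int] = None            # speaker demographics
    speaker_gender: Optional[str] = None         # speaker demographics
    recording_environment: Optional[str] = None
    recording_device: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject labels outside the four classes named in the abstract.
        if self.impairment_class not in IMPAIRMENT_CLASSES:
            raise ValueError(f"unknown impairment class: {self.impairment_class!r}")
        if self.duration_seconds < 0:
            raise ValueError("duration_seconds must be non-negative")


def hours_per_class(recordings: Iterable[Recording]) -> Dict[str, float]:
    """Sum recording durations per impairment class, in hours."""
    totals: Dict[str, float] = defaultdict(float)
    for rec in recordings:
        totals[rec.impairment_class] += rec.duration_seconds / 3600.0
    return dict(totals)
```

Given a list of such records loaded from the corpus metadata, `hours_per_class` would give the per-class breakdown of the total audio, which is useful for checking class balance before training a recognizer.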
Problem

Research questions and friction points this paper is trying to address.

disordered speech recognition
impaired speech dataset
low-resource languages
Akan language
speech technology
Innovation

Methods, ideas, or system contributions that make the work stand out.

disordered speech recognition
low-resource language
impaired speech dataset
Akan language
assistive speech technology
Isaac Wiafe
Department of Computer Science, University of Ghana
Akon Obu Ekpezu
Department of Computer Science, University of Ghana
Sumaya Ahmed Salihs
Department of Computer Science, University of Ghana
Elikem Doe Atsakpo
Department of Computer Science, University of Ghana
Fiifi Baffoe Payin Winful
Department of Computer Science, University of Ghana
Jamal-Deen Abdulai
Dr of Computer Science, University of Ghana
Computer Networking, Wireless Communication Systems, Sensor Networks, AI and Machine Learning