🤖 AI Summary
Problem: Disordered speech data in low-resource languages is scarce, which hinders the equitable deployment of automatic speech recognition (ASR) for people with speech disabilities. Method: We propose a community-driven paradigm for data collection and model development, piloted on Akan, the most widely spoken language in Ghana, by creating the first open-source disordered speech corpus for Akan (Akan-Disordered Speech Corpus). We accompany it with a reusable data collection “recipe,” a lightweight speech annotation tool, and adaptation guidelines. Through open collaboration, local communities co-collect data and fine-tune open ASR models (Whisper, Wav2Vec 2.0) to better recognize speech affected by articulation disorders. Contribution/Results: This work establishes the first systematic, democratized pipeline for building disordered-speech ASR in low-resource languages, offering a transferable methodology and a practical benchmark for disordered speech research in under-resourced languages worldwide.
📝 Abstract
This study presents an approach for collecting speech samples to build Automatic Speech Recognition (ASR) models for impaired speech, particularly in low-resource languages. It aims to democratize ASR technology and data collection by developing a "cookbook" of best practices and training for community-driven data collection and ASR model building. As a proof of concept, this study curated the first open-source dataset of impaired speech in Akan, a widely spoken indigenous language in Ghana. The study involved participants with speech impairments from diverse backgrounds. The resulting dataset, along with the cookbook and open-source tools, is publicly available to enable researchers and practitioners to create inclusive ASR technologies tailored to the unique needs of individuals with speech impairments. In addition, this study presents initial results from fine-tuning open-source ASR models to better recognize impaired speech in Akan.