CDSD: Chinese Dysarthria Speech Database

📅 2023-10-24

🏛️ Interspeech

📈 Citations: 3

✨ Influential: 0

🤖 AI Summary

Speech recognition performance for dysarthric speakers remains suboptimal, primarily because high-quality Mandarin Chinese dysarthric speech data is scarce. To address this, we introduce CDSD, the largest publicly available Mandarin dysarthric speech database to date, comprising 133 hours of multi-device synchronized recordings from 44 speakers. CDSD is the first such resource to incorporate clinical severity-level annotations, rigorous speech quality control, and a standardized ASR benchmarking protocol. It fills a critical gap in Mandarin dysarthric speech resources and enables reproducible model evaluation. Benchmark experiments using CTC- and Transformer-based ASR models achieve a best character error rate (CER) of 16.4%, lower than the 20.45% CER measured for human transcription. These results suggest that data-driven, dysarthria-specific ASR systems can improve communication accessibility for affected individuals.

📝 Abstract

Dysarthric speech poses significant challenges for individuals with dysarthria, impacting their ability to communicate socially. Despite the widespread use of Automatic Speech Recognition (ASR), accurately recognizing dysarthric speech remains a formidable task, largely due to the limited availability of dysarthric speech data. To address this gap, we developed the Chinese Dysarthria Speech Database (CDSD), the most extensive collection of Chinese dysarthria data to date, featuring 133 hours of recordings from 44 speakers. Our benchmarks reveal a best Character Error Rate (CER) of 16.4%. Compared with the CER of 20.45% from our additional human experiments, Dysarthric Speech Recognition (DSR) shows potential to significantly improve communication for individuals with dysarthria. The CDSD database will be made publicly available at http://melab.psych.ac.cn/CDSD.html.
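The abstract reports results as Character Error Rate. As a minimal sketch, assuming the conventional definition (character-level Levenshtein distance divided by the number of reference characters), CER could be computed as below. The function names `edit_distance` and `cer` and the example sentence are illustrative assumptions; the paper's actual scoring tool and text normalization are not described here.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between reference and hypothesis."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """CER under the assumed definition: edit distance / reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one substituted and one dropped character
# against a six-character reference.
example_cer = cer("今天天气很好", "今天天汽很")
```

Because insertions count as errors, CER under this definition can exceed 1.0 when the hypothesis is much longer than the reference.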

Problem

Research questions and friction points this paper is trying to address.

Develops the Chinese Dysarthria Speech Database (CDSD)

Addresses limited dysarthric speech data

Improves Automatic Speech Recognition accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed extensive Chinese dysarthria database

Improved dysarthric speech recognition accuracy

Achieved a best Character Error Rate of 16.4%, below the 20.45% from human transcription
