🤖 AI Summary
This paper addresses the lack of a unified benchmark for Modern Standard Arabic (MSA) pronunciation assessment by introducing QuranMB.v1, the first open-source, standardized benchmark for MSA mispronunciation detection, grounded in Qur'anic recitation. Methodologically, the authors (1) design a customized phoneme inventory that reflects MSA's phonetic characteristics, particularly emphatic consonants and vowel-length distinctions; (2) establish a rigorous evaluation pipeline encompassing data cleaning, forced alignment, and expert-annotated mispronunciation labeling; and (3) systematically evaluate multiple baselines, including ASR-based and pronunciation-scoring models. Key contributions include the first standardized framework for Arabic pronunciation assessment; the first publicly available MSA mispronunciation benchmark tailored to religious recitation; and an empirical analysis that reveals core challenges in MSA pronunciation modeling (e.g., pharyngeal articulation, vowel-quantity contrast) and the performance ceilings of current models, yielding a reproducible, extensible benchmark for future research.
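To make the task concrete: mispronunciation detection at the phoneme level is often framed as comparing a canonical (reference) phoneme sequence against the sequence decoded from the learner's speech, flagging mismatches. The sketch below is a minimal, hypothetical illustration of that idea using a plain Levenshtein alignment; the phoneme symbols and the alignment strategy are illustrative assumptions, not the paper's actual inventory or pipeline.

```python
# Hypothetical sketch: phoneme-level mispronunciation detection via
# edit-distance alignment of canonical vs. decoded phoneme sequences.
# Symbols below are illustrative, not the paper's phoneme inventory.

def align(ref, hyp):
    """Levenshtein alignment returning (ref_phone, hyp_phone) pairs;
    None marks an insertion or deletion."""
    n, m = len(ref), len(hyp)
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,       # deletion
                           dp[i][j - 1] + 1,       # insertion
                           dp[i - 1][j - 1] + cost)  # match/substitution
    # Backtrack to recover the aligned pairs.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                dp[i][j] == dp[i - 1][j - 1] + (0 if ref[i - 1] == hyp[j - 1] else 1)):
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            pairs.append((ref[i - 1], None))  # phoneme deleted by speaker
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))  # phoneme inserted by speaker
            j -= 1
    return pairs[::-1]

def mispronunciations(ref, hyp):
    """Return the aligned pairs where the decoded phoneme differs."""
    return [p for p in align(ref, hyp) if p[0] != p[1]]

# Example: emphatic /sˤ/ realized as plain /s/, long vowel shortened,
# two error types the summary highlights for MSA.
ref = ["sˤ", "aː", "b", "r"]
hyp = ["s", "a", "b", "r"]
print(mispronunciations(ref, hyp))  # [('sˤ', 's'), ('aː', 'a')]
```

Real systems score each aligned pair with model posteriors (e.g., goodness-of-pronunciation) rather than a hard match, but the alignment step above is the common backbone.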
📝 Abstract
We present a unified benchmark for mispronunciation detection in Modern Standard Arabic (MSA) using Qur'anic recitation as a case study. Our approach lays the groundwork for advancing Arabic pronunciation assessment by providing a comprehensive pipeline that spans data processing, the development of a specialized phoneme set tailored to the nuances of MSA pronunciation, and the creation of the first publicly available test set for this task, which we term the Qur'anic Mispronunciation Benchmark (QuranMB.v1). Furthermore, we evaluate several baseline models to provide initial performance insights, thereby highlighting both the promise and the challenges inherent in assessing MSA pronunciation. By establishing this standardized framework, we aim to foster further research and development in pronunciation assessment in Arabic language technology and related applications.