🤖 AI Summary
Arabic automatic speech recognition (ASR) faces challenges including linguistic complexity, a scarcity of open-source models, and insufficient dialectal coverage; existing work predominantly targets Modern Standard Arabic (MSA), neglecting Classical Arabic (CA) and multi-dialect joint modeling. This paper introduces the first open-source, end-to-end ASR model to jointly support MSA and CA, built on the FastConformer architecture. The approach integrates large-scale data preprocessing, multi-task learning, and phoneme-aware training. On standard MSA benchmarks, the model achieves state-of-the-art (SOTA) performance; on diacritized CA recognition, a previously unaddressed task, it sets the first SOTA accuracy while maintaining strong generalization to MSA. The complete model and training framework are publicly released, providing a scalable foundation for multi-dialect Arabic speech understanding.
📝 Abstract
Despite Arabic being one of the most widely spoken languages, the development of Arabic Automatic Speech Recognition (ASR) systems faces significant challenges due to the language's complexity, and only a limited number of public Arabic ASR models exist. While much of the focus has been on Modern Standard Arabic (MSA), considerably less attention has been given to the variations within the language. This paper introduces a universal methodology for Arabic speech and text processing designed to address the unique challenges of the language. Using this methodology, we train two novel models based on the FastConformer architecture: one designed specifically for MSA and the other, the first unified public model for both MSA and Classical Arabic (CA). The MSA model sets a new benchmark with state-of-the-art (SOTA) performance on related datasets, while the unified model achieves SOTA accuracy on diacritized CA while maintaining strong performance on MSA. To promote reproducibility, we open-source the models and their training recipes.