🤖 AI Summary
This work addresses the long-standing scarcity of efficient, dialect-specific self-supervised models for multi-dialectal Arabic speech processing. It presents the first self-supervised pre-training approach tailored to the diverse family of Arabic dialects, training Conformer models under the BEST-RQ framework on 5,640 hours of web-crawled and publicly available speech data. The resulting models achieve state-of-the-art dialect identification with fewer parameters than competing systems and significantly outperform general-purpose multilingual and non-Arabic monolingual models on automatic speech recognition. These results demonstrate the effectiveness of domain-targeted pre-training for low-resource, linguistically heterogeneous language varieties such as Arabic dialects.
📝 Abstract
We present Ara-BEST-RQ, a family of self-supervised learning (SSL) models specifically designed for multi-dialectal Arabic speech processing. Leveraging 5,640 hours of crawled Creative Commons speech combined with publicly available datasets, we pre-train Conformer-based BEST-RQ models with up to 600M parameters. Our models are evaluated on dialect identification (DID) and automatic speech recognition (ASR) tasks, achieving state-of-the-art performance on the former while using fewer parameters than competing models. We demonstrate that family-targeted pre-training on Arabic dialects significantly improves downstream performance compared to multilingual or monolingual models trained on non-Arabic data. All models, code, and pre-processed datasets will be publicly released to support reproducibility and further research in Arabic speech technologies.
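For readers unfamiliar with BEST-RQ, its core idea is that pre-training targets come from a frozen random-projection quantizer: each speech frame is projected by a fixed random matrix and assigned the index of its nearest vector in a fixed random codebook, and the Conformer is trained to predict these indices for masked frames. The following NumPy sketch illustrates that target-generation step only; the dimensions, codebook size, and function names are illustrative assumptions, not the authors' configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

FEATURE_DIM = 80      # e.g. log-mel filterbank dimension (assumed)
PROJ_DIM = 16         # projection dimension (illustrative)
CODEBOOK_SIZE = 8192  # number of discrete targets (illustrative)

# In BEST-RQ both the projection matrix and the codebook are randomly
# initialized and kept frozen; neither receives gradient updates.
projection = rng.normal(size=(FEATURE_DIM, PROJ_DIM))
codebook = rng.normal(size=(CODEBOOK_SIZE, PROJ_DIM))
codebook /= np.linalg.norm(codebook, axis=1, keepdims=True)

def quantize(frames: np.ndarray) -> np.ndarray:
    """Map (T, FEATURE_DIM) speech frames to (T,) discrete target indices."""
    projected = frames @ projection
    projected /= np.linalg.norm(projected, axis=1, keepdims=True)
    # Nearest codebook entry by cosine similarity (vectors are unit-norm).
    return np.argmax(projected @ codebook.T, axis=1)

# Example: 100 frames of random "speech" features yield 100 target indices,
# which the masked Conformer encoder would be trained to predict.
targets = quantize(rng.normal(size=(100, FEATURE_DIM)))
```

Because the quantizer is frozen, the targets are cheap to compute and deterministic for a given input, which is what makes BEST-RQ attractive for large-scale pre-training on heterogeneous crawled speech.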