A Multi-Dialectal Dataset for German Dialect ASR and Dialect-to-Standard Speech Translation

📅 2025-06-03
🤖 AI Summary
German dialects are underrepresented in ASR research, and evaluation data for dialectal speech is scarce. To address this, we introduce Betthupferl, an evaluation dataset with four hours of read speech in three dialect groups spoken in Southeast Germany (Franconian, Bavarian, Alemannic) plus half an hour of Standard German speech. The dataset provides both dialectal and Standard German transcriptions, which supports dialectal ASR evaluation as well as dialect-to-Standard German speech translation. Because model output can be compared against either transcription, the benchmark shows how far a system retains dialectal forms versus normalizing them. We benchmark several state-of-the-art multilingual models, including Whisper and SeamlessM4T. The outputs differ in how closely they resemble the dialectal versus the standardized transcriptions, and a qualitative error analysis of the best model shows that it sometimes normalizes grammatical differences but often stays closer to the dialectal constructions.

📝 Abstract
Although Germany has a diverse landscape of dialects, they are underrepresented in current automatic speech recognition (ASR) research. To enable studies of how robust models are towards dialectal variation, we present Betthupferl, an evaluation dataset containing four hours of read speech in three dialect groups spoken in Southeast Germany (Franconian, Bavarian, Alemannic), and half an hour of Standard German speech. We provide both dialectal and Standard German transcriptions, and analyze the linguistic differences between them. We benchmark several multilingual state-of-the-art ASR models on speech translation into Standard German, and find differences between how much the output resembles the dialectal vs. standardized transcriptions. Qualitative error analyses of the best ASR model reveal that it sometimes normalizes grammatical differences, but often stays closer to the dialectal constructions.

Problem

Research questions and friction points this paper is trying to address.

Underrepresentation of German dialects in ASR research
Need for robust models handling dialectal speech variation
Challenges in dialect-to-standard German speech translation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-dialectal dataset for German ASR
Benchmarking multilingual ASR models
Analyzing dialect-to-standard translation errors
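
The comparison of model output against the dialectal and the standardized transcriptions can be illustrated with a minimal sketch. It assumes word error rate (WER) as the measure; this summary does not state which metrics the paper actually uses. The three sentences are invented for illustration and are not taken from Betthupferl.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance over word tokens (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(ref_words, hyp_words) / len(ref_words)


# Hypothetical example, NOT from the dataset: one utterance with two references.
dialect_ref = "i hob des ned gwusst"          # invented dialectal transcription
standard_ref = "ich habe das nicht gewusst"   # invented Standard German transcription
model_output = "ich hab das nicht gewusst"    # invented ASR hypothesis

wer_dialect = wer(dialect_ref, model_output)
wer_standard = wer(standard_ref, model_output)

# A lower WER against the standard reference suggests the model normalized
# the utterance; a lower WER against the dialectal reference suggests it
# stayed closer to the dialect.
print(f"WER vs. dialectal reference: {wer_dialect:.2f}")
print(f"WER vs. standard reference:  {wer_standard:.2f}")
```

In this invented case the output is much closer to the standard reference than to the dialectal one, so it would count as largely normalized. A word-level score like this captures lexical and orthographic distance only; grammatical differences of the kind the abstract mentions still need qualitative inspection.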