Mixture of LoRA Experts for Low-Resourced Multi-Accent Automatic Speech Recognition

📅 2025-05-26
🤖 AI Summary
To address the poor robustness of ASR against non-native speech in low-resource multi-accent scenarios, and the challenge of adapting to an accent that may be unknown at inference time, this paper proposes MAS-LoRA, a Mixture-of-Experts (MoE) framework built from accent-specific Low-Rank Adaptation (LoRA) modules. Applied to Whisper, MAS-LoRA combines dedicated LoRA experts, one per accent, within a single model, mitigating catastrophic forgetting and handling both known and unknown accents at inference without further fine-tuning. On the L2-ARCTIC multi-accent benchmark, MAS-LoRA achieves significantly lower Word Error Rate than standard LoRA and full fine-tuning when the accent is unknown; results further improve when the accent is known. To the authors' knowledge, this is the first use of a mixture of LoRA experts for non-native multi-accent ASR.

📝 Abstract
We aim to improve the robustness of Automatic Speech Recognition (ASR) systems against non-native speech, particularly in low-resourced multi-accent settings. We introduce Mixture of Accent-Specific LoRAs (MAS-LoRA), a fine-tuning method that leverages a mixture of Low-Rank Adaptation (LoRA) experts, each specialized in a specific accent. This method can be used when the accent is known or unknown at inference time, without the need to fine-tune the model again. Our experiments, conducted using Whisper on the L2-ARCTIC corpus, demonstrate significant improvements in Word Error Rate compared to regular LoRA and full fine-tuning when the accent is unknown. When the accent is known, the results further improve. Furthermore, MAS-LoRA shows less catastrophic forgetting than the other fine-tuning methods. To the best of our knowledge, this is the first use of a mixture of LoRA experts for non-native multi-accent ASR.
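The paper does not spell out its exact routing mechanism here, but the core idea of a mixture of accent-specific LoRA experts can be sketched generically: a frozen base weight is augmented with several low-rank adapter pairs, and their outputs are combined by a gate that is one-hot when the accent is known and spread across experts when it is not. The sketch below is a minimal illustrative implementation under those assumptions; all names, dimensions, and the uniform-gate fallback are hypothetical, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank, n_experts = 8, 8, 2, 4  # toy sizes, not the paper's

W = rng.standard_normal((d_out, d_in))            # frozen base (e.g. Whisper) weight
A = rng.standard_normal((n_experts, rank, d_in))  # per-accent LoRA down-projections
B = np.zeros((n_experts, d_out, rank))            # LoRA up-projections, zero-initialized
alpha = 4.0                                       # LoRA scaling hyperparameter

def mas_lora_forward(x, gate):
    """y = W x + sum_k gate_k * (alpha / rank) * B_k A_k x."""
    y = W @ x
    for k in range(n_experts):
        y += gate[k] * (alpha / rank) * (B[k] @ (A[k] @ x))
    return y

x = rng.standard_normal(d_in)
one_hot = np.eye(n_experts)[1]               # accent known: route to expert 1 only
uniform = np.full(n_experts, 1 / n_experts)  # accent unknown: blend all experts
y_known = mas_lora_forward(x, one_hot)
y_unknown = mas_lora_forward(x, uniform)
```

Because the up-projections `B` start at zero (the standard LoRA initialization), both gated outputs initially equal the frozen model's output `W @ x`, so adaptation starts from the base model and only the small `A`/`B` matrices are trained per accent.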
Problem

Research questions and friction points this paper is trying to address.

Improve ASR robustness for low-resourced multi-accent speech
Leverage accent-specific LoRA experts without re-tuning
Reduce catastrophic forgetting in fine-tuning methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture of Accent-Specific LoRA experts
Handles known and unknown accents
Reduces catastrophic forgetting significantly
Authors

Raphael Bagat, Université de Lorraine, CNRS, Inria, LORIA, F-54000 Nancy, France
I. Illina, Université de Lorraine, CNRS, Inria, LORIA, F-54000 Nancy, France
Emmanuel Vincent, Senior Research Scientist, Inria (speech & audio)