TalTech Systems for the Interspeech 2025 ML-SUPERB 2.0 Challenge

📅 2025-06-02

🤖 AI Summary

This work addresses the Interspeech 2025 ML-SUPERB 2.0 Challenge, which covers language identification (LID) and multilingual automatic speech recognition (ASR), including low-resource languages. The system uses a hybrid LID approach that combines a pretrained language embedding model with a lightweight ASR model, built from an encoder shared across languages and language-specific bigram language models. For ASR, exactly one of three models is applied to each language, chosen by training data availability and performance on held-out data: a fine-tuned SeamlessM4T, MMS-1B-all with custom language adapters, or MMS zero-shot. The system obtained the top overall score in the challenge.

📝 Abstract

This paper describes the language identification and multilingual speech recognition system developed at Tallinn University of Technology for the Interspeech 2025 ML-SUPERB 2.0 Challenge. A hybrid language identification system is used, consisting of a pretrained language embedding model and a light-weight speech recognition model with a shared encoder across languages and language-specific bigram language models. For speech recognition, three models are used, where only a single model is applied for each language, depending on the training data availability and performance on held-out data. The model set consists of a finetuned version of SeamlessM4T, MMS-1B-all with custom language adapters and MMS-zeroshot. The system obtained the top overall score in the challenge.
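
The per-language choice among the three recognizers can be pictured as a small routing table. The following is a minimal sketch, not the authors' implementation: it assumes the selection criterion is the lowest error rate on held-out data, and the candidate names, the `select_model` and `build_routing_table` helpers, and the fallback to the zero-shot model are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical identifiers for the three systems named in the abstract.
CANDIDATES = ("seamless_m4t_finetuned", "mms_1b_all_adapter", "mms_zeroshot")


def select_model(heldout_error: Dict[str, Optional[float]],
                 fallback: str = "mms_zeroshot") -> str:
    """Return the candidate with the lowest held-out error rate.

    A value of None marks a candidate that could not be trained or
    evaluated for this language (for example, no training data).
    Falling back to the zero-shot model is an assumption of this sketch.
    """
    scored = {name: err for name, err in heldout_error.items() if err is not None}
    if not scored:
        return fallback
    return min(scored, key=lambda name: scored[name])


def build_routing_table(
        results: Dict[str, Dict[str, Optional[float]]]) -> Dict[str, str]:
    """Map each language code to the single model that will decode it."""
    return {lang: select_model(errors) for lang, errors in results.items()}


if __name__ == "__main__":
    # Made-up error rates, used only to exercise the code.
    demo = {
        "est": {"seamless_m4t_finetuned": 0.08,
                "mms_1b_all_adapter": 0.11,
                "mms_zeroshot": 0.30},
        "xxx": {name: None for name in CANDIDATES},
    }
    print(build_routing_table(demo))
```

At inference time, the language predicted by the identification stage would index this table, so only one recognizer runs per utterance.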

Problem

Research questions and friction points this paper is trying to address.

Identifying the spoken language in multilingual speech with a hybrid language identification system

Selecting one of several speech recognition models for each language

Obtaining the top overall score in the ML-SUPERB 2.0 Challenge

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid language identification system

Shared encoder across languages

Custom language adapters for MMS-1B-all
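
To illustrate how language-specific bigram language models can contribute to language identification, the sketch below scores a decoded hypothesis under one bigram model per language and picks the best-scoring language. It rests on assumptions the summary does not confirm: the models here operate on characters, use add-one smoothing, and are applied to a text hypothesis. The `BigramLM` class and `identify_language` function are hypothetical names, and the combination with the pretrained language embedding model is not shown.

```python
import math
from collections import Counter
from typing import Dict, Iterable


class BigramLM:
    """Character bigram model with add-one smoothing (illustrative choice)."""

    def __init__(self, texts: Iterable[str]) -> None:
        self.bigrams: Counter = Counter()
        self.contexts: Counter = Counter()
        self.vocab: set = set()
        for text in texts:
            padded = "^" + text + "$"  # start and end markers
            self.vocab.update(padded)
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def log_prob(self, text: str) -> float:
        """Smoothed log-probability of `text` under this model."""
        padded = "^" + text + "$"
        size = len(self.vocab) + 1  # one extra slot for unseen symbols
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            numerator = self.bigrams[(prev, cur)] + 1
            denominator = self.contexts[prev] + size
            total += math.log(numerator / denominator)
        return total


def identify_language(hypothesis: str, lms: Dict[str, BigramLM]) -> str:
    """Return the language whose bigram model scores the hypothesis highest."""
    return max(lms, key=lambda lang: lms[lang].log_prob(hypothesis))


if __name__ == "__main__":
    # Tiny toy corpora, for demonstration only.
    lms = {
        "en": BigramLM(["the cat sat on the mat", "this is the house"]),
        "et": BigramLM(["see on väike maja", "kass istub õues"]),
    }
    print(identify_language("the hat", lms))
    print(identify_language("maja on õues", lms))
```

Because the encoder is shared, adding a language in a setup like this would only require training another small bigram model rather than a new acoustic model.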
