Language Family Matters: Evaluating LLM-Based ASR Across Linguistic Boundaries

📅 2026-01-26
🤖 AI Summary
This work addresses an inefficiency in current large language model (LLM)-based automatic speech recognition (ASR) systems: they train a separate connector for each language and ignore genealogical relationships between languages, which leads to parameter redundancy and limited generalization. To overcome this, the study incorporates language family information into the design of LLM-ASR connectors. It proposes a lightweight connector, placed between a frozen speech encoder and a pretrained LLM, that is shared by all languages in the same family and so enables knowledge transfer among them. Evaluated on two multilingual LLMs and two real-world speech corpora, the method reduces the connector parameter count while improving generalization across domains, which makes it practical for multilingual deployment.

📝 Abstract
Large Language Model (LLM)-powered Automatic Speech Recognition (ASR) systems achieve strong performance with limited resources by linking a frozen speech encoder to a pretrained LLM via a lightweight connector. Prior work trains a separate connector per language, overlooking linguistic relatedness. We propose an efficient and novel connector-sharing strategy based on linguistic family membership, enabling one connector per family, and empirically validate its effectiveness across two multilingual LLMs and two real-world corpora spanning curated and crowd-sourced speech. Our results show that family-based connectors reduce parameter count while improving generalization across domains, offering a practical and scalable strategy for multilingual ASR deployment.
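The parameter saving described above can be illustrated with a small sketch. It is not the authors' implementation: the language-to-family table, the single linear projection used as the connector, and the dimensions 1024 and 4096 are all assumptions chosen for illustration. The sketch only shows how routing each language to one connector per family, rather than one per language, changes the number of connectors and therefore the total parameter count.

```python
from collections import defaultdict

# Hypothetical language -> family table (illustrative; not the paper's language set).
LANGUAGE_FAMILY = {
    "es": "Romance",
    "it": "Romance",
    "pt": "Romance",
    "de": "Germanic",
    "nl": "Germanic",
    "pl": "Slavic",
    "cs": "Slavic",
}


def linear_connector_params(encoder_dim: int, llm_dim: int) -> int:
    """Parameters of one linear projection with bias (an assumed connector shape)."""
    return encoder_dim * llm_dim + llm_dim


def group_by_family(language_family: dict) -> dict:
    """Collect the languages that would share a connector under family-based sharing."""
    groups = defaultdict(list)
    for lang, family in language_family.items():
        groups[family].append(lang)
    return {family: sorted(langs) for family, langs in groups.items()}


def connector_key(lang: str, language_family: dict, shared: bool = True) -> str:
    """Return the key of the connector that handles `lang`.

    With shared=True the key is the language family; otherwise it is the language itself.
    """
    return language_family[lang] if shared else lang


def total_connector_params(
    language_family: dict, encoder_dim: int, llm_dim: int, shared: bool = True
) -> int:
    """Total connector parameters across all distinct connectors."""
    keys = {connector_key(lang, language_family, shared) for lang in language_family}
    return len(keys) * linear_connector_params(encoder_dim, llm_dim)


if __name__ == "__main__":
    enc_dim, llm_dim = 1024, 4096  # assumed dimensions, not taken from the paper
    per_language = total_connector_params(LANGUAGE_FAMILY, enc_dim, llm_dim, shared=False)
    per_family = total_connector_params(LANGUAGE_FAMILY, enc_dim, llm_dim, shared=True)
    print(group_by_family(LANGUAGE_FAMILY))
    print(f"per-language connectors: {per_language:,} parameters")
    print(f"per-family connectors:   {per_family:,} parameters")
```

In this toy setting, seven per-language connectors collapse to three per-family connectors, so the connector parameter total scales with the number of families rather than the number of languages. Whether shared connectors also improve recognition quality is an empirical question that the paper addresses; the sketch says nothing about accuracy.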
Problem

Research questions and friction points this paper is trying to address.

Language Family
LLM-based ASR
Multilingual ASR
Connector Sharing
Linguistic Relatedness
Innovation

Methods, ideas, or system contributions that make the work stand out.

language family
connector sharing
multilingual ASR
LLM-based speech recognition
parameter efficiency

Yuchen Zhang
University of Essex
Deep Learning, Fake News Detection, Natural Language Processing, Social Computation
Ravi Shekhar
University of Essex
Natural Language Processing, Computer Vision, Machine Learning
H. Mouratidis
Institute for Analytics and Data Science, University of Essex; School of Computer Science and Electronic Engineering, University of Essex