IITKGP-ABSP Submission to LRE22: Language Recognition in Low-Resource Settings

📅 2025-01-15
🤖 AI Summary
This work addresses language identification for 14 low-resource African languages under self-imposed constraints: the system uses only the LRE22 development set and no pre-trained models. Method: We propose a pre-training-free framework driven by data augmentation and multi-classifier fusion. It applies diverse audio augmentations (time-frequency masking, speed perturbation, and additive noise), extracts x-vector embeddings, and then fuses SVM and ECAPA-TDNN classifiers. The design targets both low-resource adaptability and efficient edge deployment. Contribution/Results: On the LRE22 development set, the framework achieves an EER of 11.43% and a Cavg of 0.41, substantially outperforming baseline methods. This indicates that pre-training-free approaches are effective and practical for spoken language identification in very low-resource settings.
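The three augmentations named in the summary can be sketched as follows. This is a minimal, standard-library illustration and not the authors' implementation: the function names, the linear-interpolation resampler, the white Gaussian noise model, and the single-band masking policy are all assumptions made for the example.

```python
import math
import random


def speed_perturb(wave, factor):
    """Resample by linear interpolation so the clip is `factor` times shorter.

    Assumption: simple interpolation stands in for a proper resampler.
    """
    n_out = max(1, round(len(wave) / factor))
    out = []
    for i in range(n_out):
        # Map output index i onto a fractional position in the input.
        pos = i * (len(wave) - 1) / max(1, n_out - 1)
        lo = int(pos)
        hi = min(lo + 1, len(wave) - 1)
        frac = pos - lo
        out.append((1 - frac) * wave[lo] + frac * wave[hi])
    return out


def add_noise(wave, snr_db, rng):
    """Add white Gaussian noise at the requested signal-to-noise ratio in dB."""
    signal_power = sum(x * x for x in wave) / len(wave)
    noise_std = math.sqrt(signal_power / (10 ** (snr_db / 10)))
    return [x + rng.gauss(0.0, noise_std) for x in wave]


def time_freq_mask(spec, max_f, max_t, rng):
    """Zero one random frequency band and one random time band.

    `spec` is a list of rows indexed as spec[freq][time]; the input is not
    modified. Requires max_f <= number of rows and max_t <= number of columns.
    """
    n_freq, n_time = len(spec), len(spec[0])
    f = rng.randint(0, max_f)          # band height (inclusive bounds)
    f0 = rng.randint(0, n_freq - f)    # band start
    t = rng.randint(0, max_t)          # band width
    t0 = rng.randint(0, n_time - t)
    return [
        [0.0 if (f0 <= i < f0 + f or t0 <= j < t0 + t) else v
         for j, v in enumerate(row)]
        for i, row in enumerate(spec)
    ]
```

Applying each function with several parameter settings (for example, a few speed factors and SNR levels, chosen here only for illustration) multiplies the number of training utterances without adding any external data, which is consistent with the fixed-set constraint described above.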

📝 Abstract
This is the detailed system description of the IITKGP-ABSP lab's submission to the NIST Language Recognition Evaluation (LRE) 2022. The objective of this evaluation (LRE22) is to recognize 14 low-resource African languages. Although NIST provided additional training and development data, we develop our systems under the additional constraint of an extremely low-resource setting. Our primary fixed-set submission uses only the LRE22 development data, which contains utterances of the 14 target languages. We further restrict our system from using any pre-trained models for feature extraction or classifier fine-tuning. To address the low-resource condition, our system relies on diverse audio augmentations followed by classifier fusion. Abiding by all of these constraints, the proposed methods achieve an EER of 11.43% and a cost metric of 0.41 on the LRE22 development set. For users with limited computational resources or limited storage and network capabilities, the proposed system can help achieve efficient LID performance.
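To make the fusion and EER terminology concrete, here is a small sketch. It is not the authors' pipeline: the abstract does not say how the classifiers are fused, so the weighted score sum in `fuse_scores` is an assumed scheme, and `equal_error_rate` is a plain threshold sweep that treats each (utterance, language) trial as target or non-target. The NIST cost metric is not reproduced here.

```python
def fuse_scores(score_sets, weights):
    """Weighted sum of per-language scores from several classifiers.

    Assumption: score-level fusion with hand-set weights; the source does
    not specify the fusion rule.
    """
    n_lang = len(score_sets[0])
    return [
        sum(w * scores[k] for w, scores in zip(weights, score_sets))
        for k in range(n_lang)
    ]


def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is
    taken where the miss rate and false-alarm rate are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for th in thresholds:
        miss = sum(s < th for s in target_scores) / len(target_scores)
        false_alarm = sum(s >= th for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap, eer = gap, (miss + false_alarm) / 2
    return eer


# Toy usage with made-up scores for two classifiers and two languages.
fused = fuse_scores([[0.2, 0.8], [0.6, 0.4]], [0.5, 0.5])
predicted_language = max(range(len(fused)), key=fused.__getitem__)
```

An EER of 11.43% means that, at the operating point where misses and false alarms are balanced, roughly one trial in nine is decided incorrectly.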
Problem

Research questions and friction points this paper is trying to address.

Identification of low-resource African languages
Limited data
Accuracy improvement

Innovation

Methods, ideas, or system contributions that make the work stand out.

Minimal Resource Conditions
Audio Processing Techniques
Error Rate Reduction
Spandan Dey
ABSP Lab, Department of E & ECE, Indian Institute of Technology Kharagpur, India
Md Sahidullah
TCG CREST & Academy of Scientific and Innovative Research (AcSIR)
Signal Processing, Speech Processing, Representation Learning, Machine Learning
Goutam Saha
ABSP Lab, Department of E & ECE, Indian Institute of Technology Kharagpur, India