IIITH-BUT system for IWSLT 2025 low-resource Bhojpuri to Hindi speech translation

📅 2025-06-05
🤖 AI Summary
This work addresses the low-resource Bhojpuri→Hindi speech translation task in IWSLT 2025. Building upon the SeamlessM4T framework, we systematically investigate the impact of hyperparameter configurations—including learning rate scheduling, warm-up steps, and label smoothing—on performance under data scarcity. We further propose, for the first time, a Bhojpuri–Marathi cross-lingual joint training strategy to mitigate limited Bhojpuri data. Augmented with speed perturbation and SpecAugment, and guided by BLEU-oriented error analysis, our approach significantly improves translation quality. Experiments demonstrate substantial BLEU score gains on the IWSLT 2025 benchmark and reveal phoneme confusion and word-order misalignment as dominant error types. Our key contributions are: (1) the first fine-grained hyperparameter sensitivity analysis tailored to Bhojpuri speech translation; (2) a novel cross-lingual joint training paradigm; and (3) a reproducible, transferable optimization framework for low-resource speech translation.

📝 Abstract
This paper presents the submission of IIITH-BUT to the IWSLT 2025 shared task on speech translation for the low-resource Bhojpuri-Hindi language pair. We explored the impact of hyperparameter optimisation and data augmentation techniques on the performance of the SeamlessM4T model fine-tuned for this specific task. We systematically investigated a range of hyperparameters, including learning rate schedules, number of update steps, warm-up steps, label smoothing, and batch sizes, and report their effect on translation quality. To address data scarcity, we applied speed perturbation and SpecAugment and studied their effect on translation quality. We also examined the use of cross-lingual signal through joint training with Marathi and Bhojpuri speech data. Our experiments reveal that careful selection of hyperparameters and the application of simple yet effective augmentation techniques significantly improve performance in low-resource settings. We also analysed the translation hypotheses to understand the various kinds of errors that affected translation quality in terms of BLEU.
Problem

Research questions and friction points this paper is trying to address.

Optimizing hyperparameters for Bhojpuri-Hindi speech translation
Applying data augmentation to address low-resource constraints
Evaluating cross-lingual training with Marathi and Bhojpuri data
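
To make two of the hyperparameters named above concrete, here is a minimal Python sketch of a warm-up learning rate schedule and a label-smoothed loss. The listing does not say which schedule shape or values the authors used, so the inverse-square-root decay, the default `peak_lr`, `warmup_steps`, and `epsilon` values, and both function names are assumptions chosen for illustration only.

```python
import math


def lr_at_step(step, peak_lr=1e-4, warmup_steps=1000):
    """Learning rate at a 1-based update step.

    Assumed shape: linear warm-up to peak_lr, then inverse-square-root decay.
    The source does not state the schedule actually used.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    if step <= warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * math.sqrt(warmup_steps / step)


def label_smoothed_nll(log_probs, target, epsilon=0.1):
    """Cross-entropy against a smoothed target distribution.

    The gold token keeps weight (1 - epsilon); the remaining epsilon is
    spread uniformly over the whole vocabulary.
    """
    vocab_size = len(log_probs)
    nll = -log_probs[target]                  # standard negative log-likelihood
    uniform = -sum(log_probs) / vocab_size    # loss against a uniform target
    return (1.0 - epsilon) * nll + epsilon * uniform
```

With `epsilon=0` the loss reduces to the ordinary negative log-likelihood, which is a quick way to check how much smoothing changes training behaviour on a small dataset.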
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hyperparameter optimization for SeamlessM4T fine-tuning
Data augmentation with speed perturbation and SpecAugment
Cross-lingual training using Marathi and Bhojpuri data
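
The two augmentation techniques listed above can be sketched in plain Python as follows. This is a simplified illustration, not the authors' pipeline: speed perturbation is approximated by linear-interpolation resampling, and the SpecAugment-style step applies a single frequency mask and a single time mask without time warping. The function names, mask widths, and the list-of-frames spectrogram layout are all assumptions.

```python
import random


def speed_perturb(samples, factor):
    """Resample a waveform by linear interpolation so it plays `factor` times faster."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    n_out = max(1, int(len(samples) / factor))
    out = []
    for i in range(n_out):
        pos = i * factor                       # fractional position in the input
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


def spec_augment(spec, max_freq_mask=2, max_time_mask=2, rng=None):
    """Zero one random band of frequency bins and one random span of frames.

    `spec` is a list of frames, each a list of frequency-bin values.
    Returns a masked copy; the input is left unchanged.
    """
    rng = rng or random.Random()
    n_frames, n_bins = len(spec), len(spec[0])
    out = [list(frame) for frame in spec]
    f = rng.randint(0, min(max_freq_mask, n_bins))   # frequency mask width
    f0 = rng.randint(0, n_bins - f)                  # first masked bin
    t = rng.randint(0, min(max_time_mask, n_frames)) # time mask width
    t0 = rng.randint(0, n_frames - t)                # first masked frame
    for frame in out:
        for b in range(f0, f0 + f):
            frame[b] = 0.0
    for i in range(t0, t0 + t):
        out[i] = [0.0] * n_bins
    return out
```

A common way to use speed perturbation is to add copies of each utterance at factors slightly below and above 1.0 (for example 0.9 and 1.1), while masking is applied on the fly to the features during training; whether the authors followed this recipe is not stated in this listing.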
Bhavana Akkiraju
International Institute of Information Technology, Hyderabad, India
Aishwarya Pothula
International Institute of Information Technology, Hyderabad, India
Santosh Kesiraju
Brno University of Technology
Speech and language processing, Machine learning
Anil Kumar Vuppala
Associate Professor, LTRC, IIIT Hyderabad
Speech signal processing