🤖 AI Summary
This study addresses performance bottlenecks in machine translation (MT) and question answering (QA) for low-resource Slavic languages, specifically Ukrainian, Upper Sorbian, and Lower Sorbian. To overcome data scarcity, the authors propose a multi-task fine-tuning framework built upon Qwen2.5-3B-Instruct, using parameter-efficient fine-tuning (PEFT) to jointly optimize MT and multiple-choice QA on heterogeneous training data drawn from diverse translation corpora and QA benchmarks. Contextual understanding is further enhanced via retrieval-augmented generation (RAG), and generalization is improved through model ensembling. Experiments show consistent improvements over strong baselines on both tasks, and the approach exhibits robustness and cross-task transferability, supporting its effectiveness for low-resource Slavic language processing. Overall, the work offers a scalable, resource-conscious methodology for advancing NLP in under-resourced Slavic languages.
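The model-ensembling step mentioned above can be illustrated with a minimal sketch: for each multiple-choice question, several fine-tuned checkpoints each predict an answer letter, and the ensemble returns the majority choice. This is an illustrative majority-vote scheme, not necessarily the authors' exact combination rule; the function name and tie-breaking policy are assumptions.

```python
from collections import Counter

def ensemble_answer(predictions):
    """Majority vote over per-model answer letters for one
    multiple-choice question; ties go to the earliest-listed model."""
    counts = Counter(predictions)
    top = max(counts.values())
    for p in predictions:  # preserve model order when counts tie
        if counts[p] == top:
            return p

# Three hypothetical fine-tuned checkpoints answering one question:
print(ensemble_answer(["B", "B", "C"]))  # → B
```

Because ties fall back to list order, the first model can be made the most trusted one, e.g. the checkpoint with the best validation accuracy.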
📝 Abstract
This paper presents the JGU Mainz submission to the WMT25 Shared Task on LLMs with Limited Resources for Slavic Languages: Machine Translation and Question Answering, focusing on Ukrainian, Upper Sorbian, and Lower Sorbian. For each language, we jointly fine-tune a Qwen2.5-3B-Instruct model for both tasks with parameter-efficient fine-tuning. Our pipeline integrates additional translation and multiple-choice question answering (QA) data. For Ukrainian QA, we further use retrieval-augmented generation, and for Upper and Lower Sorbian QA we apply ensembling. Experiments show that our models outperform the baseline on both tasks.
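The retrieval-augmented generation step used for Ukrainian QA follows a standard pattern: retrieve the passage most relevant to the question, then prepend it to the model prompt as context. The sketch below is a self-contained toy version using word-overlap scoring; a real pipeline would typically use a stronger retriever (e.g. BM25 or dense embeddings), and the function names and prompt format here are assumptions, not the paper's implementation.

```python
def retrieve(question, passages, k=1):
    """Rank candidate passages by word overlap with the question
    and return the top-k as context for the QA prompt."""
    q_words = set(question.lower().split())
    ranked = sorted(
        passages,
        key=lambda p: len(q_words & set(p.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, passages):
    """Prepend the retrieved context to the question (illustrative format)."""
    context = "\n".join(retrieve(question, passages))
    return f"Context: {context}\nQuestion: {question}\nAnswer:"

passages = [
    "Kyiv is the capital of Ukraine.",
    "Bautzen is a town in Upper Lusatia.",
]
print(build_prompt("What is the capital of Ukraine?", passages))
```

The prompt produced this way gives the fine-tuned model grounding text to condition on, which is especially helpful when the answer is unlikely to be stored in a 3B model's parameters.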