🤖 AI Summary
This study addresses the lack of systematic open-source solutions and comprehensive evaluation for Esperanto in modern machine translation. It presents the first systematic benchmark of open-source machine translation systems for Esperanto, evaluating rule-based approaches, encoder-decoder architectures (notably NLLB), and fine-tuned large language models across six language directions using both automatic metrics and human assessment. Results indicate that the NLLB family of models consistently achieves the strongest performance and is preferred by human evaluators in approximately half of the tested comparisons. The project publicly releases its best-performing models and code, filling a critical research gap for Esperanto in multilingual AI and fostering the development of its technical ecosystem.
📝 Abstract
Esperanto is a widely used constructed language, known for its regular grammar and productive word formation. Although it has substantial resources available thanks to its online community, it remains relatively underexplored in the context of modern machine translation (MT). In this work, we present the first comprehensive evaluation of open-source MT systems for Esperanto, comparing rule-based systems, encoder-decoder models, and large language models (LLMs) across a range of model sizes. We evaluate translation quality across six language directions involving English, Spanish, Catalan, and Esperanto, using multiple automatic metrics as well as human evaluation. Our results show that the NLLB family achieves the best performance in all language pairs, followed closely by our trained compact models and a fine-tuned general-purpose LLM. Human evaluation confirms this trend, with NLLB translations preferred in approximately half of the comparisons, although noticeable errors remain. In line with Esperanto's tradition of openness and international collaboration, we release our code and best-performing models publicly.