🤖 AI Summary
This paper presents the first systematic study of multilingual definition modeling, focusing on Spanish, French, Portuguese, and German. Methodologically, it fine-tunes multilingual pretrained models (e.g., mBERT, XLM-R) on monolingual dictionary data and conducts zero-shot evaluation of large language models (LLMs) such as ChatGPT and the Llama series; evaluation combines BERTScore with human assessment. Key contributions: (1) showing that current multilingual models reach English-level performance in each language but fail to exploit cross-lingual synergies; (2) demonstrating strong zero- and few-shot definition-generation capability of LLMs, with higher naturalness and stability; and (3) identifying a strong correlation between BERTScore on this task and mainstream multilingual LLM benchmarks, supporting its use as a lightweight, interpretable alternative for multilingual evaluation.
📝 Abstract
In this paper, we propose the first multilingual study on definition modeling. We use monolingual dictionary data for four new languages (Spanish, French, Portuguese, and German) and perform an in-depth empirical study to test the performance of pre-trained multilingual language models on definition modeling of monosemic words when fine-tuned on this data. Furthermore, we use a zero-shot approach to test the multilingual capabilities of two popular chat-based Large Language Models (LLMs) on the task. Results show that multilingual language models can perform on par with English but cannot leverage potential cross-lingual synergies, with LLMs generally offering better performance overall. A comprehensive human evaluation of the LLM-generated definitions highlights the zero- and few-shot capabilities of these models on this new task, while also exposing their shortcomings. Finally, we show that performance on our task measured via BERTScore correlates strongly with performance on multilingual LLM benchmarks, suggesting that our task offers a viable compute-constrained, stable, and natural alternative to them.
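Since BERTScore is the paper's central automatic metric, a minimal sketch of its greedy-matching F1 may help. Real BERTScore computes cosine similarity between contextual token embeddings from a pretrained encoder (typically via the `bert-score` package); the random/identity embeddings below are stand-ins for illustration only, not the paper's actual setup.

```python
import numpy as np

def bertscore_f1(cand_emb: np.ndarray, ref_emb: np.ndarray) -> float:
    """Greedy-matching F1 in the style of BERTScore.

    cand_emb, ref_emb: (num_tokens, dim) arrays of token embeddings.
    In the real metric these come from a contextual encoder; any
    embeddings work here for demonstration purposes.
    """
    # Normalize rows so dot products become cosine similarities.
    c = cand_emb / np.linalg.norm(cand_emb, axis=1, keepdims=True)
    r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    sim = c @ r.T  # pairwise cosine-similarity matrix

    # Each candidate token greedily matches its best reference token
    # (precision); each reference token its best candidate token (recall).
    precision = sim.max(axis=1).mean()
    recall = sim.max(axis=0).mean()
    return float(2 * precision * recall / (precision + recall))

# Identical token embeddings yield a perfect score of 1.0.
emb = np.eye(3)
print(bertscore_f1(emb, emb))  # → 1.0
```

The greedy token matching is what makes the metric robust to word order and paraphrase, which is presumably why it suits generated definitions better than n-gram overlap metrics.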