A Parallel Cross-Lingual Benchmark for Multimodal Idiomaticity Understanding

📅 2026-01-13
🤖 AI Summary
This work addresses the challenge of understanding potentially idiomatic expressions (PIEs) in multilingual and multimodal systems, focusing on how well that understanding transfers across languages and across modalities. It introduces XMPIE, a parallel multilingual and multimodal benchmark for idiom comprehension that spans 34 languages and contains over ten thousand items created by language experts. Each PIE is paired with five images covering a spectrum from idiomatic to literal meaning, including semantically related and random distractors. Designed to support both cross-lingual and text-image evaluation, XMPIE enables comparative analysis of shared cultural aspects and of model generalization in idiom understanding.

📝 Abstract
Potentially idiomatic expressions (PIEs) construe meanings inherently tied to the everyday experience of a given language community. As such, they constitute an interesting challenge for assessing the linguistic (and to some extent cultural) capabilities of NLP systems. In this paper, we present XMPIE, a parallel multilingual and multimodal dataset of potentially idiomatic expressions. The dataset, containing 34 languages and over ten thousand items, allows comparative analyses of idiomatic patterns among language-specific realisations and preferences in order to gather insights about shared cultural aspects. Because the dataset is parallel, it makes it possible to evaluate model performance for a given PIE in different languages and to test whether idiomatic understanding in one language can be transferred to another. Moreover, the dataset supports the study of PIEs across textual and visual modalities, to measure to what extent PIE understanding in one modality (text or image) transfers to, or implies understanding in, the other. The data was created by language experts, with both textual and visual components crafted under multilingual guidelines. Each PIE is accompanied by five images representing a spectrum from idiomatic to literal meanings, including semantically related and random distractors. The result is a high-quality benchmark for evaluating multilingual and multimodal idiomatic language understanding.
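The cross-lingual comparison described in the abstract can be illustrated with a minimal sketch. It assumes that each model prediction has already been reduced to a record saying whether the model picked the idiomatic image for a given PIE in a given language. The `Prediction` class, its field names, and the `transfer_rate` measure are hypothetical illustrations, not the dataset's schema or the authors' evaluation protocol.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Prediction:
    # All field names are hypothetical; they are not taken from the XMPIE release.
    pie_id: str    # identifier shared by the parallel realisations of one PIE
    language: str  # language code of this realisation, e.g. "en"
    correct: bool  # True if the model chose the idiomatic image among the five


def accuracy_by_language(preds: list[Prediction]) -> dict[str, float]:
    """Fraction of correct predictions for each language."""
    totals: dict[str, int] = defaultdict(int)
    hits: dict[str, int] = defaultdict(int)
    for p in preds:
        totals[p.language] += 1
        hits[p.language] += int(p.correct)
    return {lang: hits[lang] / totals[lang] for lang in totals}


def transfer_rate(
    preds: list[Prediction], source: str, target: str
) -> Optional[float]:
    """Among PIEs answered correctly in `source`, the fraction that are
    also answered correctly in `target`. Returns None if no PIE qualifies."""
    by_pie: dict[str, dict[str, bool]] = defaultdict(dict)
    for p in preds:
        by_pie[p.pie_id][p.language] = p.correct
    # Keep only PIEs present in both languages and solved in the source language.
    solved_in_source = [
        langs for langs in by_pie.values()
        if source in langs and target in langs and langs[source]
    ]
    if not solved_in_source:
        return None
    return sum(langs[target] for langs in solved_in_source) / len(solved_in_source)
```

Because the items are parallel, grouping by a shared PIE identifier is what allows a per-item comparison between languages rather than only a comparison of aggregate scores. The same grouping could be applied to modalities instead of languages.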
Problem

Research questions and friction points this paper is trying to address.

idiomaticity
multilingual
multimodal
cross-lingual
NLP benchmark
Innovation

Methods, ideas, or system contributions that make the work stand out.

cross-lingual benchmark
multimodal idiomaticity
parallel multilingual dataset
idiomatic expression understanding
text-image alignment
Dilara Torunoğlu-Selamet
Doğukan Arslan
Rodrigo Wilkens
University of Exeter
Natural Language Processing, Language Acquisition, Readability, Text Simplification
Wei He
Doruk Eryiğit
Thomas Mark Rodbourn Pickard
Adriana Pagano
Aline Villavicencio
University of Exeter and University of Sheffield (UK)
Natural Language Processing, Multiword Expressions, Language Acquisition and Evolution
Gülşen Eryiğit
Ágnes Abuczki
Aida Cardoso
Alesia Lazarenka
Dina Almassova
Amalia Mendes
Anna Kanellopoulou
Antoni Brosa-Rodríguez
Baiba Saulite
Beata Wójtowicz
Bolette S. Pedersen
Carlos Manuel Hidalgo-Ternero
Chaya Liebeskind
Danka Jokić
Diego Alves
Eleni Triantafyllidi
Erik Velldal
Professor at the University of Oslo, Dept. of Informatics, Language Technology Group
Natural Language Processing, Machine Learning
Fred Philippy
University of Luxembourg
Natural Language Processing, Deep Learning, Data Science
G. Oleškevičienė
Ieva Rizgelienė
I. Skadina
Irina Lobzhanidze
Isabell Stinessen Haugen
Jauza Akbar Krito
Jelena M. Marković
Johanna Monti
Professor of Modern Languages Teaching, Università di Napoli L'Orientale
Computational Linguistics, Machine Translation, Computer Aided Translation, Localisation, Artificial
Josue Alejandro Sauca
Kaja Dobrovoljc
Kingsley O. Ugwuanyi
Laura Rituma
Lilja Øvrelid
Maha Tufail Agro
Manzura Abjalova
M. Chatzigrigoriou
María del Mar Sánchez Ramos
Marija Pendevska
Masoumeh Seyyedrezaei
M. Shamsfard
Momina Ahsan
Muhammad Ahsan Riaz Khan
Nathalie Carmen Hau Norman
N. Ayyıldız
Nina Hosseini-Kivanani
Noémi Ligeti-Nagy
Numaan Naeem
Olha Kanishcheva
Olha Yatsyshyna
D. Orel
Petra Giommarelli
P. Osenova
R. Garabík
Regina E. Semou
R. Rebechi
Salsabila Zahirah Pranida
Samia Touileb
University of Bergen
Natural Language Processing, Computational Linguistics, Under-resourced Languages, Bias and Fairness in NLP
Sanni Nimb
Sarfraz Ahmad
Sarvinoz Nematkhonova
Shahar Golan
Shaoxiong Ji
Technical University of Darmstadt
Machine Learning, Natural Language Processing, Health Informatics
Sopuruchi Christian Aboh
Srdjan Sucur
Stella Markantonatou
Institute for Language and Speech Processing, Athena Research Center
Multiword Expressions, Less resourced languages, Lexica, Ontologies, Syntax
S. Olsen
Vahide Tajalli
Veronika Lipp
Voula Giouli
Assistant Professor, Aristotle University of Thessaloniki & ILSP - ATHENA RC
Computational Linguistics, Natural Language Processing, Lexical Semantics, Digital Humanities
Yelda Yeşildal Eraydin
Zahra Saaberi
SBU NLP Lab, Shahid Beheshti University
Artificial Intelligence, Natural Language Processing
Zhuohan Xie
MBZUAI
Financial AI, Reasoning, Natural Language Processing, Computational Linguistics, Deep Learning