🤖 AI Summary
This study evaluates the natural language inference (NLI) capabilities of large language models (LLMs) on Basque and Spanish regional varieties, focusing on the performance degradation induced by dialectal variation. To this end, the authors introduce a novel, manually curated parallel NLI dataset in Basque and Spanish together with their respective variants, covering under-resourced varieties such as Western Basque. Experiments span both encoder-only and decoder-based architectures, using cross-lingual transfer and in-context learning. Results show that accuracy drops under dialectal variation, especially in Basque, and error analysis indicates that the decline stems from the linguistic variation itself rather than from lexical overlap. Ablation experiments show that Western Basque is particularly challenging for encoder-only models, empirically supporting the linguistic hypothesis that peripheral dialects deviate more substantially from the standard variety. All data and code are publicly released.
📝 Abstract
In this paper, we evaluate the capacity of current language technologies to understand Basque and Spanish language varieties. We use Natural Language Inference (NLI) as a pivot task and introduce a novel, manually curated parallel dataset in Basque and Spanish, along with their respective variants. Our empirical analysis of cross-lingual and in-context learning experiments using encoder-only and decoder-based Large Language Models (LLMs) shows a performance drop when handling linguistic variation, especially in Basque. Error analysis suggests that this decline is not due to lexical overlap, but rather to the linguistic variation itself. Further ablation experiments indicate that encoder-only models particularly struggle with Western Basque, which aligns with linguistic theory identifying peripheral dialects (e.g., Western) as more distant from the standard. All data and code are publicly available.