🤖 AI Summary
This study addresses the automatic identification of English loanwords (anglicisms) in Spanish news texts. We propose a multi-paradigm detection framework and conduct a systematic comparison of large language models (LLMs), Transformer-based architectures, deep learning models, and rule-based systems. To advance research on low-resource linguistic phenomena, we introduce the first fine-grained evaluation benchmark specifically designed for Spanish anglicism detection, enabling rigorous assessment of model robustness to morphological variation, contextual dependency, and out-of-vocabulary terms. Experimental results show that the best fine-tuned LLM achieves an F1 score of 0.99—substantially outperforming traditional approaches (lowest F1: 0.17)—highlighting the critical role of semantic modeling and domain adaptation in cross-lingual lexical identification. Our work provides a reproducible methodology and empirical evidence supporting language preservation efforts, computational lexicography, and multilingual NLP development.
📝 Abstract
This paper summarizes the main findings of ADoBo 2025, the shared task on anglicism identification in Spanish organized as part of IberLEF 2025. Participants were asked to detect English lexical borrowings (anglicisms) in a collection of Spanish journalistic texts. Five teams submitted systems for the test phase. The proposed systems included LLMs, deep learning models, Transformer-based models, and rule-based systems. F1 scores ranged from 0.17 to 0.99, showing how widely performance on this task varies across systems.