Synthetic Fluency: Hallucinations, Confabulations, and the Creation of Irish Words in LLM-Generated Translations

📅 2025-04-10
🤖 AI Summary
This study investigates hallucination, specifically the generation of non-existent lexical items, in large language model (LLM) translations into Irish, and considers its potential implicit influence on the lexical evolution of low-resource, highly inflected languages. Method: We systematically compare translations produced by GPT-4o and GPT-4o Mini, using linguistic annotation, morphological rule validation, and qualitative discourse analysis. Contribution/Results: We identify six systematic noun hallucination patterns and categorize verb-related hallucinations. Both models exhibit similar hallucination types, but GPT-4o Mini produces them significantly more often. Most hallucinated forms superficially conform to Irish morphological constraints, suggesting that LLMs may contribute to language evolution through "synthetic fluency": the production of plausible but unattested forms. The study introduces a conceptual framework for assessing LLM-induced linguistic impact, offering methodological grounding and empirical evidence for evaluating LLM effects on endangered and low-resource languages.

📝 Abstract
This study examines hallucinations in Large Language Model (LLM) translations into Irish, specifically focusing on instances where the models generate novel, non-existent words. We classify these hallucinations within verb and noun categories, identifying six distinct patterns among the latter. Additionally, we analyse whether these hallucinations adhere to Irish morphological rules and what linguistic tendencies they exhibit. Our findings show that while both GPT-4o and GPT-4o Mini produce similar types of hallucinations, the Mini model generates them at a significantly higher frequency. Beyond classification, the discussion raises speculative questions about the implications of these hallucinations for the Irish language. Rather than seeking definitive answers, we offer food for thought regarding the increasing use of LLMs and their potential role in shaping Irish vocabulary and linguistic evolution. We aim to prompt discussion on how such technologies might influence language over time, particularly in the context of low-resource, morphologically rich languages.
Problem

Research questions and friction points this paper is trying to address.

Examines hallucinations in LLM translations into Irish
Classifies novel non-existent words in verb and noun categories
Analyzes adherence to Irish morphological rules and linguistic tendencies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Classifies hallucinations in LLM Irish translations
Analyzes adherence to Irish morphological rules
Compares hallucination frequency between GPT-4o and GPT-4o Mini
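
The frequency comparison described above can be pictured with a minimal sketch. This is not the authors' procedure, which relies on linguistic annotation. It only assumes that a reference word list is available and that any token missing from it counts as a candidate for review. The function names, the tiny lexicon, and the placeholder token `zzfocal` are all hypothetical and chosen purely for illustration.

```python
import re

def tokenize(text):
    # Lowercase, then keep runs of letters (accented vowels included),
    # allowing internal hyphens or apostrophes.
    return re.findall(r"[^\W\d_]+(?:[-'][^\W\d_]+)*", text.lower())

def flag_unattested(text, lexicon):
    # Return the tokens that do not appear in the reference lexicon.
    # These are only candidates: a human annotator would still need
    # to confirm that each one is a genuinely non-existent word.
    return [tok for tok in tokenize(text) if tok not in lexicon]

def unattested_rate(translations, lexicon):
    # Share of tokens across all translations that are not in the lexicon.
    total = 0
    flagged = 0
    for text in translations:
        tokens = tokenize(text)
        total += len(tokens)
        flagged += sum(1 for tok in tokens if tok not in lexicon)
    return flagged / total if total else 0.0

# Hypothetical toy lexicon; "zzfocal" below is a made-up placeholder token.
toy_lexicon = {"tá", "an", "madra", "mór"}
candidates = flag_unattested("Tá an madra zzfocal", toy_lexicon)
```

Running `unattested_rate` separately over each model's output would give one rate per model to compare. A plain surface-form lookup like this would over-flag in practice, because Irish words change shape through inflection and initial mutations (lenition and eclipsis), so a realistic pipeline would need morphological analysis and manual checking on top of it.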
Sheila Castilho
SALIS/ADAPT Centre - Dublin City University
machine translation, MT evaluation, NLP
Zoe Fitzsimmons
SALIS, Dublin City University
Claire Holton
SALIS, Dublin City University
Aoife Mc Donagh
SALIS, Dublin City University