HausaNLP at SemEval-2025 Task 3: Towards a Fine-Grained Model-Aware Hallucination Detection

📅 2025-03-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses hallucination detection in multilingual large language model (LLM) outputs, proposing the first model-aware, fine-grained hallucination span localization framework. Focusing on English, it integrates natural language inference (NLI)-based semantic modeling with synthetic-data-driven fine-tuning of ModernBERT—trained on 400 high-quality samples—to explicitly capture hallucination generation mechanisms and severity. The method jointly optimizes semantic-logical consistency and boundary discrimination capability. Evaluated on the MU-SHROOM benchmark, it achieves an IoU score of 0.032 and a correlation of 0.422 between predicted hallucination confidence and human annotations. This represents the first empirical validation of model-aware, fine-grained hallucination detection, demonstrating both feasibility and effectiveness. The framework establishes a novel paradigm for interpretable, multilingual hallucination assessment, advancing beyond coarse-grained binary classification toward precise, explainable hallucination localization.

Technology Category

Application Category

📝 Abstract
This paper presents our findings of the Multilingual Shared Task on Hallucinations and Related Observable Overgeneration Mistakes, MU-SHROOM, which focuses on identifying hallucinations and related overgeneration errors in large language models (LLMs). The shared task involves detecting specific text spans that constitute hallucinations in the outputs generated by LLMs in 14 languages. To address this task, we aim to provide a nuanced, model-aware understanding of hallucination occurrences and severity in English. We used natural language inference and fine-tuned a ModernBERT model using a synthetic dataset of 400 samples, achieving an Intersection over Union (IoU) score of 0.032 and a correlation score of 0.422. These results indicate a moderately positive correlation between the model's confidence scores and the actual presence of hallucinations. The IoU score indicates that our model has a relatively low overlap between the predicted hallucination span and the truth annotation. The performance is unsurprising, given the intricate nature of hallucination detection. Hallucinations often manifest subtly, relying on context, making pinpointing their exact boundaries formidable.
Problem

Research questions and friction points this paper is trying to address.

Detect hallucinations in large language model outputs
Identify overgeneration errors across 14 languages
Measure hallucination severity and model confidence correlation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuned ModernBERT for hallucination detection
Used synthetic dataset with 400 samples
Applied natural language inference techniques
🔎 Similar Papers
No similar papers found.
M
Maryam Bala
HausaNLP, University of Abuja, Bayero University Kano, Data Science for Social Impact, University of Pretoria, Imperial College London, Northeastern University
A
Amina Imam Abubakar
HausaNLP, University of Abuja
A
Abdulhamid Abubakar
HausaNLP
A
Abdulkadir Shehu Bichi
HausaNLP
Hafsa Kabir Ahmad
Hafsa Kabir Ahmad
Bayero University Kano
Network Representation Learningrecommender systemsonline education
S
Sani Abdullahi Sani
HausaNLP
Idris Abdulmumin
Idris Abdulmumin
Postdoctoral Fellow, DSFSI, University of Pretoria
Machine TranslationNeural Machine TranslationNatural Language ProcessingInternet Technology
S
Shamsuddeen Hassan Muhamad
HausaNLP, Bayero University Kano, Imperial College London
Ibrahim Said Ahmad
Ibrahim Said Ahmad
Northeastern University
Natural Language ProcessingBig DataData miningArtificial Intelligence