Evaluation of LLMs on Long-tail Entity Linking in Historical Documents

📅 2025-05-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study presents the first systematic evaluation of large language models (LLMs) on long-tail entity linking (EL) in historical documents—targeting low-frequency, domain-specific, and sparsely annotated historical entities. Using the manually curated MHERCL v0.1 benchmark, we employ zero-shot prompting with Wikidata as the external knowledge source and compare LLMs (GPT and Llama3) against traditional EL systems (e.g., ReLiK). Results demonstrate that LLMs significantly outperform baselines, achieving a 23.6% absolute accuracy gain on long-tail entities and substantially narrowing the performance gap between head and tail entities. Crucially, this improvement is attained without fine-tuning and with minimal reliance on external resources, underscoring LLMs’ practical utility and strong generalization capacity for low-resource historical natural language processing tasks. The findings highlight LLMs’ promise in addressing data scarcity and domain specificity challenges endemic to historical text analysis.
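The zero-shot setup described above can be sketched as a simple prompt-construction step: given a sentence and an entity mention, the model is asked to return the corresponding Wikidata identifier. The template and function name below are illustrative assumptions, not the authors' exact prompt.

```python
def build_el_prompt(sentence: str, mention: str) -> str:
    """Build a zero-shot entity-linking prompt.

    Illustrative template only; the paper's exact prompt
    wording is not reproduced here.
    """
    return (
        "You are an entity linking assistant.\n"
        f"Sentence: {sentence}\n"
        f"Mention: {mention}\n"
        "Return the Wikidata QID of the entity this mention refers to, "
        "or NIL if no suitable Wikidata entry exists."
    )

# Example with a historical-domain sentence of the kind found in MHERCL
prompt = build_el_prompt(
    "The maestro conducted the premiere at the Teatro alla Scala.",
    "Teatro alla Scala",
)
```

The returned string would then be sent to the LLM (e.g. GPT or Llama3) as-is, with no fine-tuning, and the predicted QID compared against the gold Wikidata annotation.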

📝 Abstract
Entity Linking (EL) plays a crucial role in Natural Language Processing (NLP) applications, enabling the disambiguation of entity mentions by linking them to their corresponding entries in a reference knowledge base (KB). Thanks to their deep contextual understanding capabilities, LLMs offer a new perspective on EL, promising better results than traditional methods. Despite the impressive generalization capabilities of LLMs, linking less popular, long-tail entities remains challenging, as these entities are often underrepresented in training data and knowledge bases. Furthermore, long-tail EL is an understudied problem, and few studies address it with LLMs. In the present work, we assess the performance of two popular LLMs, GPT and Llama3, in a long-tail entity linking scenario. Using MHERCL v0.1, a manually annotated benchmark of sentences from domain-specific historical texts, we quantitatively compare the performance of LLMs in identifying and linking entities to their corresponding Wikidata entries against that of ReLiK, a state-of-the-art Entity Linking and Relation Extraction framework. Our preliminary experiments reveal that LLMs perform encouragingly well in long-tail EL, indicating that this technology can be a valuable adjunct in closing the gap between head and long-tail EL.
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLMs on long-tail entity linking in historical documents
Addressing underrepresentation of long-tail entities in training data
Comparing LLMs with traditional methods in entity linking performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs for long-tail entity linking
Comparison with ReLiK framework
Using MHERCL v0.1 benchmark
Marta Boscariol
Department of Management, University of Turin, Italy
Luana Bulla
University of Catania
Lia Draetta
Department of Computer Science, University of Turin, Italy
Beatrice Fiumano
Department of Modern Languages, Literatures and Cultures, University of Bologna, Italy
Emanuele Lenzi
Department of Information Engineering (DII), University of Pisa, Italy; Institute of Information Science and Technologies (ISTI), National Research Council of Italy (CNR), Pisa, Italy
Leonardo Piano
Department of Mathematics and Computer Science, University of Cagliari, Italy