LINK-KG: LLM-Driven Coreference-Resolved Knowledge Graphs for Human Smuggling Networks

2025-10-30
AI Summary
Analysis of human smuggling networks is hindered by lengthy legal documents, ambiguous coreference, and a lack of structural annotation, which lead to fragmented knowledge graphs and inconsistent entity linking. To address these challenges, the paper proposes LINK-KG, a modular framework that couples a three-stage, large language model (LLM)-guided coreference resolution pipeline with downstream knowledge graph extraction. Its core component is a type-specific Prompt Cache that keeps entity references consistent across document chunks. Compared to baseline methods, LINK-KG reduces average node duplication by 45.21% and noisy nodes by 32.22%, yielding cleaner and more coherent graphs for criminal network analysis.

Abstract
Human smuggling networks are complex and constantly evolving, making them difficult to analyze comprehensively. Legal case documents offer rich factual and procedural insights into these networks but are often long, unstructured, and filled with ambiguous or shifting references, posing significant challenges for automated knowledge graph (KG) construction. Existing methods either overlook coreference resolution or fail to scale beyond short text spans, leading to fragmented graphs and inconsistent entity linking. We propose LINK-KG, a modular framework that integrates a three-stage, LLM-guided coreference resolution pipeline with downstream KG extraction. At the core of our approach is a type-specific Prompt Cache, which consistently tracks and resolves references across document chunks, enabling clean and disambiguated narratives for structured knowledge graph construction from both short and long legal texts. LINK-KG reduces average node duplication by 45.21% and noisy nodes by 32.22% compared to baseline methods, resulting in cleaner and more coherent graph structures. These improvements establish LINK-KG as a strong foundation for analyzing complex criminal networks.
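The abstract does not spell out how the type-specific Prompt Cache is implemented. As a rough illustration of the idea only, the sketch below keeps one alias table per entity type and renders it as text that could be prepended to the prompt for the next document chunk. The class name `TypedPromptCache`, its methods, and the entity types used are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class TypedPromptCache:
    """Hypothetical per-type alias store; NOT the paper's implementation."""

    def __init__(self):
        # entity type -> {lowercased alias -> canonical name}
        self._aliases = defaultdict(dict)

    def register(self, entity_type, canonical, aliases=()):
        """Record a canonical name and its known aliases under one entity type."""
        for name in (canonical, *aliases):
            self._aliases[entity_type][name.lower()] = canonical

    def resolve(self, entity_type, mention):
        """Return the canonical name for a mention, or the mention itself if unknown."""
        return self._aliases[entity_type].get(mention.lower(), mention)

    def as_prompt_block(self, entity_type):
        """Render the aliases of one type as text to carry into the next chunk's prompt."""
        by_canonical = defaultdict(list)
        for alias, canonical in self._aliases[entity_type].items():
            by_canonical[canonical].append(alias)
        lines = [
            f"{canonical}: {', '.join(sorted(aliases))}"
            for canonical, aliases in sorted(by_canonical.items())
        ]
        return "\n".join(lines)


# Illustrative usage with made-up names
cache = TypedPromptCache()
cache.register("Person", "John Smith", ["Smith", "the defendant"])
print(cache.resolve("Person", "the defendant"))
print(cache.as_prompt_block("Person"))
```

Keying the table by entity type means a mention such as "Smith" resolves only within the type it was registered under, which is one plausible reading of "type-specific"; the actual mechanism in LINK-KG may differ.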
Problem

Research questions and friction points this paper is trying to address.

Resolving coreference ambiguities in legal case documents for smuggling networks
Constructing consistent knowledge graphs from long unstructured legal texts
Reducing entity duplication and noise in automated knowledge graph extraction
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-guided coreference resolution pipeline for knowledge graphs
Type-specific Prompt Cache tracks references across document chunks
Modular framework reduces node duplication and noise
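The reported reductions in node duplication and noise are relative to baseline methods. The page does not define how duplicates are counted, so the sketch below assumes a simple proxy: nodes whose normalized names collide count as duplicates, and the reduction is the percentage drop from the baseline count. The function names and the toy node lists are illustrative assumptions, not the paper's metric or data.

```python
def count_duplicate_nodes(node_names):
    """Count surplus nodes whose normalized names collide (an assumed proxy metric)."""
    normalized = [" ".join(name.lower().split()) for name in node_names]
    return len(normalized) - len(set(normalized))


def relative_reduction(baseline, proposed):
    """Percentage reduction of `proposed` relative to a positive `baseline` count."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - proposed) / baseline


# Toy graphs with made-up node names, for illustration only
baseline_nodes = ["John Smith", "john smith", "John  Smith", "Border Agent", "border agent"]
resolved_nodes = ["John Smith", "john smith", "Border Agent"]

baseline_dupes = count_duplicate_nodes(baseline_nodes)
resolved_dupes = count_duplicate_nodes(resolved_nodes)
print(relative_reduction(baseline_dupes, resolved_dupes))
```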
Dipak Meher
Department of Computer Science, George Mason University, Fairfax, VA, USA
Carlotta Domeniconi
Professor of Computer Science, George Mason University
machine learning, data mining, clustering, classification, ensemble methods
Guadalupe Correa-Cabrera
Schar School of Policy and Government, George Mason University, Arlington, VA, USA