CORE-KG: An LLM-Driven Knowledge Graph Construction Framework for Human Smuggling Networks

📅 2025-06-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Legal documents pose several challenges for knowledge graph construction: unstructured formatting, ambiguous coreference, entity duplication, and relational noise. To address them, this paper proposes an automated knowledge graph construction method tailored to human smuggling networks. It introduces a type-aware, stepwise coreference resolution mechanism based on large language models (LLMs), combined with domain-guided entity and relationship extraction, to avoid reliance on static templates and to reduce hallucination-induced noise. The approach adapts the GraphRAG framework through structured prompts, domain-guided instructions, and a two-stage LLM pipeline (coreference resolution followed by entity and relationship extraction). Compared with a GraphRAG-based baseline, the reported results show a 33.28% reduction in node duplication and a 38.37% reduction in legal noise, yielding cleaner and more coherent graphs for downstream analysis of criminal networks.

📝 Abstract
Human smuggling networks are increasingly adaptive and difficult to analyze. Legal case documents offer valuable insights but are unstructured, lexically dense, and filled with ambiguous or shifting references, posing challenges for automated knowledge graph (KG) construction. Existing KG methods often rely on static templates and lack coreference resolution, while recent LLM-based approaches frequently produce noisy, fragmented graphs due to hallucinations, and duplicate nodes caused by a lack of guided extraction. We propose CORE-KG, a modular framework for building interpretable KGs from legal texts. It uses a two-step pipeline: (1) type-aware coreference resolution via sequential, structured LLM prompts, and (2) entity and relationship extraction using domain-guided instructions, built on an adapted GraphRAG framework. CORE-KG reduces node duplication by 33.28% and legal noise by 38.37% compared to a GraphRAG-based baseline, resulting in cleaner and more coherent graph structures. These improvements make CORE-KG a strong foundation for analyzing complex criminal networks.
Problem

Research questions and friction points this paper is trying to address.

Analyzing adaptive human smuggling networks from unstructured legal texts
Overcoming noisy, fragmented knowledge graphs from LLM-based methods
Reducing node duplication and noise in knowledge graph construction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Type-aware coreference resolution via structured LLM prompts
Domain-guided entity and relationship extraction
Adapted GraphRAG framework for cleaner graphs
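
The two-step pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the entity types, the prompt wording, the `source | relation | target` output format, and every function name here are assumptions, and `fake_llm` is a toy stand-in for a real model call.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]

# Hypothetical entity types; the summary does not list the paper's actual inventory.
ENTITY_TYPES = ["person", "location", "organization"]


def resolve_coreferences(text: str, llm: LLM,
                         entity_types: List[str] = ENTITY_TYPES) -> str:
    """Step 1: one structured prompt per entity type, applied sequentially.

    The output of each pass becomes the input of the next pass.
    """
    for entity_type in entity_types:
        prompt = (
            f"Rewrite the text so every mention of the same {entity_type} "
            f"uses one canonical name. Change nothing else.\n\nTEXT:\n{text}"
        )
        text = llm(prompt)
    return text


def extract_triples(text: str, llm: LLM) -> List[Tuple[str, str, str]]:
    """Step 2: domain-guided extraction, parsed from 'source | relation | target' lines."""
    prompt = (
        "List relationships as 'source | relation | target', one per line. "
        "Skip procedural legal boilerplate.\n\nTEXT:\n" + text
    )
    triples = []
    for line in llm(prompt).splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) == 3 and all(parts):  # drop malformed lines
            triples.append((parts[0], parts[1], parts[2]))
    return triples


def build_graph(text: str, llm: LLM) -> Dict[str, List[Tuple[str, str]]]:
    """Run both steps and collect the triples into an adjacency mapping."""
    resolved = resolve_coreferences(text, llm)
    graph: Dict[str, List[Tuple[str, str]]] = {}
    for source, relation, target in extract_triples(resolved, llm):
        graph.setdefault(source, []).append((relation, target))
    return graph


def fake_llm(prompt: str) -> str:
    """Toy stand-in for a model call, used only to exercise the pipeline."""
    body = prompt.split("TEXT:\n", 1)[1]
    if prompt.startswith("Rewrite"):
        return body.replace("the defendant", "Smith")
    return "Smith | drove | van\nnot a triple"


graph = build_graph("Smith was stopped. the defendant drove a van.", fake_llm)
```

Resolving references before extraction means that "Smith" and "the defendant" reach the extraction step as a single name, which is the intuition behind the reduced node duplication the paper reports.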
Dipak Meher
George Mason University, Fairfax, USA
Carlotta Domeniconi
Professor of Computer Science, George Mason University
machine learning, data mining, clustering, classification, ensemble methods
Guadalupe Correa-Cabrera
George Mason University, Fairfax, USA