🤖 AI Summary
Legal case documents are unstructured, full of ambiguous coreference, and prone to duplicated entities and noisy relations when processed automatically. This paper proposes CORE-KG, an automated knowledge graph construction method tailored to human smuggling networks. It combines a type-aware, stepwise large language model (LLM)-based coreference resolution mechanism with domain-guided entity and relationship extraction, avoiding reliance on static templates and reducing hallucination-induced noise. The approach adapts the GraphRAG framework through structured prompts, domain-guided instructions, and a two-stage LLM pipeline (coreference resolution followed by entity and relationship extraction). Compared to a GraphRAG-based baseline, it reduces node duplication by 33.28% and legal text noise by 38.37%, yielding cleaner, more coherent, and more interpretable graphs for analyzing criminal networks.
📝 Abstract
Human smuggling networks are increasingly adaptive and difficult to analyze. Legal case documents offer valuable insights but are unstructured, lexically dense, and filled with ambiguous or shifting references, all of which pose challenges for automated knowledge graph (KG) construction. Existing KG methods often rely on static templates and lack coreference resolution, while recent LLM-based approaches frequently produce noisy, fragmented graphs due to hallucinations, and duplicate nodes due to a lack of guided extraction. We propose CORE-KG, a modular framework for building interpretable KGs from legal texts. It uses a two-step pipeline: (1) type-aware coreference resolution via sequential, structured LLM prompts, and (2) entity and relationship extraction using domain-guided instructions, built on an adapted GraphRAG framework. CORE-KG reduces node duplication by 33.28% and legal noise by 38.37% compared to a GraphRAG-based baseline, resulting in cleaner and more coherent graph structures. These improvements make CORE-KG a strong foundation for analyzing complex criminal networks.
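The two-step pipeline described above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the CORE-KG implementation: the entity types, the prompt wording, the `head | relation | tail` output format, and every function name here are hypothetical, and `llm` stands for any function that maps a prompt string to a completion string. It does not use the GraphRAG API.

```python
from typing import Callable, Dict, List, Set, Tuple

# Assumption: the abstract does not list the entity types CORE-KG uses.
ENTITY_TYPES: List[str] = ["person", "location", "organization"]

LLM = Callable[[str], str]          # prompt in, completion out
Triple = Tuple[str, str, str]       # (head, relation, tail)


def resolve_coreferences(text: str, llm: LLM,
                         entity_types: List[str] = ENTITY_TYPES) -> str:
    """Step 1: one structured prompt per entity type, applied in sequence."""
    for entity_type in entity_types:
        prompt = (
            f"Rewrite the text so that every mention of the same {entity_type} "
            "uses one canonical name. Return only the rewritten text."
            f"\n\n{text}"
        )
        text = llm(prompt)  # the output of each pass feeds the next pass
    return text


def extract_triples(text: str, llm: LLM) -> List[Triple]:
    """Step 2: guided extraction; keeps only well-formed 'a | b | c' lines."""
    prompt = (
        "Extract entities and relationships relevant to the case. "
        "Output one 'head | relation | tail' per line."
        f"\n\n{text}"
    )
    triples: List[Triple] = []
    for line in llm(prompt).splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append((parts[0], parts[1], parts[2]))
    return triples


def build_graph(text: str, llm: LLM) -> Dict[str, Set[Tuple[str, str]]]:
    """Run both steps and collect triples into an adjacency mapping."""
    resolved = resolve_coreferences(text, llm)
    graph: Dict[str, Set[Tuple[str, str]]] = {}
    for head, relation, tail in extract_triples(resolved, llm):
        graph.setdefault(head, set()).add((relation, tail))
        graph.setdefault(tail, set())  # make sure the tail exists as a node
    return graph
```

Running coreference resolution before extraction means that variant mentions of one entity are already unified when nodes are created, which is the intuition behind the reported drop in duplicate nodes.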