🤖 AI Summary
Automatically generated knowledge graphs (KGs) often contain noise. Existing detection methods rely on external facts, logical rules, or structural embeddings, which limits their robustness: entity alignment is imperfect, rules generalize poorly across flexibly constructed KGs, and embeddings can overfit to structure. To address this, the authors propose a self-supervised denoising framework that uses the consistency between entity types and relation types as its supervision signal, so it does not depend on external facts or logical rule constraints. A type-aware encoder compresses the KG into a compact representation that models the type dependencies of triples, and a decoder reconstructs the original KG from that representation. Triples that the reconstruction fails to reproduce, meaning they deviate from the majority under type-dependent reasoning along the graph topology, are flagged as potential noise. The same architecture could in principle be applied to KG compression and completion, although that is not the focus of the work. Experiments on real-world data show that the approach is effective at detecting potential noise, which supports type consistency as a useful signal for KG denoising and an alternative to methods based on external supervision or rules.
📝 Abstract
Knowledge graphs serve as critical resources supporting intelligent systems, but they can be noisy due to imperfect automatic generation processes. Existing approaches to noise detection often rely on external facts, logical rule constraints, or structural embeddings. These methods are often challenged by imperfect entity alignment, flexible knowledge graph construction, and overfitting to structure. In this paper, we propose to exploit the consistency between entity and relation type information for noise detection, resulting in a novel self-supervised knowledge graph denoising method that avoids those problems. We formalize type inconsistency noise as triples that deviate from the majority with respect to type-dependent reasoning along the topological structure. Specifically, we first extract a compact representation of a given knowledge graph via an encoder that models the type dependencies of triples. Then, the decoder reconstructs the original input knowledge graph from the compact representation. Notably, our proposal has the potential to address the problems of knowledge graph compression and completion, although this is not our focus. For the specific task of noise detection, the discrepancy between the reconstruction results and the input knowledge graph provides an opportunity for denoising, which is facilitated by the type consistency embedded in our method. Experimental validation demonstrates the effectiveness of our approach in detecting potential noise in real-world data.
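
To make the idea of "triples that deviate from the majority" concrete, the following is a deliberately simplified sketch and not the method of the paper. It replaces the learned encoder-decoder with a frequency table: the "compact representation" is, for each relation, the share of its triples that carry each (head type, tail type) signature, and the "reconstruction error" of a triple is one minus the share of its own signature. The function names, the toy entities, the type labels, and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def encode(triples, entity_type):
    """Toy 'encoder': per relation, the share of each (head type, tail type) signature."""
    counts = defaultdict(Counter)
    for head, relation, tail in triples:
        counts[relation][(entity_type[head], entity_type[tail])] += 1
    return {
        relation: {sig: n / sum(c.values()) for sig, n in c.items()}
        for relation, c in counts.items()
    }


def reconstruction_error(triple, code, entity_type):
    """Toy 'decoder' error: 1 minus the share of this triple's type signature."""
    head, relation, tail = triple
    signature = (entity_type[head], entity_type[tail])
    return 1.0 - code[relation].get(signature, 0.0)


def detect_noise(triples, entity_type, threshold=0.5):
    """Flag triples whose type signature is poorly supported (threshold is assumed)."""
    code = encode(triples, entity_type)
    return [
        t for t in triples
        if reconstruction_error(t, code, entity_type) > threshold
    ]


# Hypothetical toy knowledge graph with one type-inconsistent triple.
entity_type = {
    "Paris": "City", "Berlin": "City", "Rome": "City",
    "France": "Country", "Germany": "Country", "Italy": "Country",
    "Einstein": "Person",
}
triples = [
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
    ("Rome", "capital_of", "Italy"),
    ("Einstein", "capital_of", "Germany"),  # a Person as head of capital_of
    ("Einstein", "born_in", "Germany"),
]

noisy = detect_noise(triples, entity_type)
print(noisy)
```

In this toy graph, three of the four `capital_of` triples have the signature (City, Country), so the single (Person, Country) triple receives a high error and is the only one flagged. A learned model, as described in the abstract, would additionally reason along the graph topology rather than treating each relation in isolation.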