🤖 AI Summary
Wikidata’s taxonomy suffers from semantic inconsistencies, including erroneous subclass relationships, over-generalized subclass links, and redundant links. These stem from its permissive editing policy and compromise the quality of the knowledge graph. To address this, the authors propose a validation method that confirms the presence of such inconsistencies in specific domains, together with a new criterion for deciding whether a detected inconsistency warrants correction. They also build a system that lets users inspect the taxonomic relationships of arbitrary Wikidata entities, so that Wikidata's crowdsourced community can verify and repair the flagged links. The approach is intended to improve the maintainability and trustworthiness of Wikidata’s classification structure.
📝 Abstract
Wikidata is currently the largest open knowledge graph on the web, encompassing over 120 million entities. It integrates data from various domain-specific databases and imports a substantial amount of content from Wikipedia, while also allowing users to edit its content freely. This openness has made Wikidata a central resource in knowledge graph research and has given users worldwide convenient access to knowledge. However, its relatively loose editorial policy has also led to taxonomic inconsistencies. Building on prior work, this study proposes and applies a novel validation method to confirm the presence of classification errors, over-generalized subclass links, and redundant connections in specific domains of Wikidata. We further introduce a new evaluation criterion for determining whether such issues warrant correction, and we develop a system that allows users to inspect the taxonomic relationships of arbitrary Wikidata entities, leveraging the platform's crowdsourced nature to its full potential.
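To make the notion of a "redundant connection" concrete, the following is a minimal Python sketch, not the authors' method. It assumes that a direct subclass link (Wikidata property P279, "subclass of") counts as redundant when the same parent is still reachable through a longer chain of subclass links. The function name, the edge-list input format, and the example labels are all hypothetical and are not taken from Wikidata or from the paper.

```python
from collections import defaultdict


def redundant_subclass_links(edges):
    """Return direct (child, parent) links that are also implied by a longer path.

    `edges` is an iterable of (child, parent) pairs, each standing for one
    "subclass of" statement. This is an illustrative definition of redundancy,
    assumed here; the paper may define it differently.
    """
    edges = set(edges)
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)

    def reachable_without(start, target, skipped_edge):
        # Depth-first search upward from `start`, ignoring the direct link under test.
        stack = [start]
        seen = {start}
        while stack:
            node = stack.pop()
            for nxt in parents.get(node, ()):
                if (node, nxt) == skipped_edge:
                    continue
                if nxt == target:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    return sorted(
        (child, parent)
        for child, parent in edges
        if reachable_without(child, parent, (child, parent))
    )


# Hypothetical toy taxonomy: "house cat" is linked to "mammal" both directly
# and indirectly through "cat".
EXAMPLE_EDGES = [
    ("house cat", "cat"),
    ("cat", "mammal"),
    ("house cat", "mammal"),
    ("mammal", "animal"),
]

if __name__ == "__main__":
    print(redundant_subclass_links(EXAMPLE_EDGES))
```

In this toy taxonomy only the direct link from "house cat" to "mammal" is flagged, because the path through "cat" already implies it. Whether such a link should actually be removed is a separate question, which is what the proposed evaluation criterion is meant to address.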