Diagnosing and Mitigating Semantic Inconsistencies in Wikidata's Classification Hierarchy

📅 2025-11-07
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Wikidata’s taxonomy suffers from semantic inconsistencies—including erroneous subclass relationships, overgeneralization, and redundant links—due to its permissive editing policy, thereby compromising knowledge graph quality. To address this, we propose a decidable criterion for assessing correction necessity—the first application of decidability to knowledge graph error correction. Our method integrates rule-based semantic validation, symbolic reasoning, and human-in-the-loop crowdsourced verification into a lightweight, scalable framework for detecting and repairing semantic inconsistencies. The system supports interactive exploration and domain-specific customization of review workflows. Evaluated across multiple domains, it effectively identifies and resolves diverse inconsistency patterns. Empirical results demonstrate substantial improvements in both the maintainability and trustworthiness of Wikidata’s classification structure.

Technology Category

Application Category

📝 Abstract
Wikidata is currently the largest open knowledge graph on the web, encompassing over 120 million entities. It integrates data from various domain-specific databases and imports a substantial amount of content from Wikipedia, while also allowing users to freely edit its content. This openness has positioned Wikidata as a central resource in knowledge graph research and has enabled convenient knowledge access for users worldwide. However, its relatively loose editorial policy has also led to a degree of taxonomic inconsistency. Building on prior work, this study proposes and applies a novel validation method to confirm the presence of classification errors, over-generalized subclass links, and redundant connections in specific domains of Wikidata. We further introduce a new evaluation criterion for determining whether such issues warrant correction and develop a system that allows users to inspect the taxonomic relationships of arbitrary Wikidata entities-leveraging the platform's crowdsourced nature to its full potential.
Problem

Research questions and friction points this paper is trying to address.

Detecting classification errors in Wikidata's taxonomic hierarchy
Identifying over-generalized subclass relationships in knowledge graphs
Addressing redundant connections in Wikidata's classification system
Innovation

Methods, ideas, or system contributions that make the work stand out.

Novel validation method detects classification errors
New evaluation criterion determines correction necessity
System leverages crowdsourcing to inspect taxonomic relationships
🔎 Similar Papers
No similar papers found.