Filling in the Blanks? A Systematic Review and Theoretical Conceptualisation for Measuring WikiData Content Gaps

📅 2025-05-22
🤖 AI Summary
This paper addresses long-standing, systemic content gaps in the Wikidata knowledge graph. Through a systematic literature review and theoretical modeling, it proposes the first integrative, multidimensional gap typology and conceptual framework, categorizing gaps into structural, topical, linguistic, and editor-behavior–associated types, and establishing an operational mapping from measurements to indicators. The framework goes beyond prior descriptive, empirically grounded approaches by identifying, for the first time, a previously overlooked gap dimension driven by collaboration mechanisms, and by clarifying coverage blind spots in existing evaluation methods and their links to editor behavior. The resulting framework provides a reusable theoretical foundation and methodological toolkit for knowledge graph content quality assessment and for identifying and mitigating systemic bias.

📝 Abstract
Wikidata is a collaborative knowledge graph which provides machine-readable structured data for Wikimedia projects including Wikipedia. Managed by a community of volunteers, it has grown to become the most edited Wikimedia project. However, it features a long tail of items with limited data and a number of systematic gaps within the available content. In this paper, we present the results of a systematic literature review aimed at understanding the state of these content gaps within Wikidata. We propose a typology of gaps based on prior research and contribute a theoretical framework intended to conceptualise gaps and support their measurement. We also describe the methods and metrics used within the literature and classify them according to our framework to identify overlooked gaps that might occur in Wikidata. We then discuss the implications for collaboration and editor activity within Wikidata as well as future research directions. Our results contribute to the understanding of quality, completeness and the impact of systematic biases within Wikidata and knowledge gaps more generally.
Problem

Research questions and friction points this paper is trying to address.

Identifying and classifying content gaps in Wikidata
Developing a theoretical framework to measure Wikidata gaps
Analyzing methods for assessing Wikidata completeness and biases
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic literature review on Wikidata gaps
Typology and framework for gap conceptualization
Methods and metrics classification for gaps
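
To make the idea of "measuring a content gap" concrete, the sketch below computes one simple indicator: the share of items in each group that carry a given property, and the spread between the best- and worst-covered groups. It is an illustration only and is not taken from the paper; the toy records, the group labels, and the function names `property_coverage` and `coverage_gap` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy records standing in for knowledge-graph items.
# These are NOT real Wikidata data; groups "A" and "B" are placeholders
# for any partition of interest (topic, language, region, ...).
ITEMS = [
    {"id": "item-1", "group": "A", "properties": {"date_of_birth", "occupation"}},
    {"id": "item-2", "group": "A", "properties": {"date_of_birth"}},
    {"id": "item-3", "group": "B", "properties": {"occupation"}},
    {"id": "item-4", "group": "B", "properties": set()},
]


def property_coverage(items, prop):
    """Return, for each group, the fraction of its items that carry `prop`."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for item in items:
        totals[item["group"]] += 1
        if prop in item["properties"]:
            hits[item["group"]] += 1
    return {group: hits[group] / totals[group] for group in totals}


def coverage_gap(coverage):
    """Return the difference between the best- and worst-covered group."""
    return max(coverage.values()) - min(coverage.values())


coverage = property_coverage(ITEMS, "date_of_birth")
gap = coverage_gap(coverage)
print(coverage, gap)
```

A single number like this captures only one narrow kind of gap (property completeness across groups); the typology described above covers several other dimensions that such a metric would not see.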