Understanding discrepancies in the coverage of OpenAlex: the case of China

📅 2025-07-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study systematically examines gaps in the OpenAlex database's coverage of scholarly publications from China and other non-English-speaking countries. It finds that the indexing of Chinese scientific literature is substantially incomplete and discontinuous over time, which distorts cross-national research performance assessments. Employing bibliometric methods, the authors conduct multidimensional quantitative comparisons against authoritative national databases (e.g., CNKI, CSCD), providing the first empirical characterization of OpenAlex's structural coverage imbalances across regional and linguistic dimensions. Results indicate that, despite improvements over prior open citation databases, OpenAlex covers less than 40% of Chinese journal articles published between 2010 and 2022, with pronounced fluctuations by discipline and year. The work provides empirical evidence on the limits of open citation databases' applicability and prompts methodological reflection on the equitable representation of non-English academic ecosystems in scientometrics.

📝 Abstract
Citation indexes play a crucial role in understanding how science is produced, disseminated, and used. However, these databases often face a critical trade-off: those offering extensive and high-quality coverage are typically proprietary, whereas publicly accessible datasets frequently exhibit fragmented coverage and inconsistent data quality. OpenAlex was developed to address this challenge, providing a freely available database with broad open coverage, with a particular emphasis on non-English-speaking countries. Yet, few studies have assessed the quality of the OpenAlex dataset. This paper assesses the coverage, by OpenAlex, of China's papers, which shows an abnormal trend, and compares it with that of other countries that do not have English as their main language. Our analysis reveals that while OpenAlex increases the coverage of China's publications, primarily those disseminated by a national database, this coverage is incomplete and discontinuous when compared to other countries' records in the database. We observe similar issues in other non-English-speaking countries, with coverage varying across regions. These findings indicate that although OpenAlex expands coverage of research outputs, continuity issues persist and disproportionately affect certain countries. We emphasize the need for researchers to use OpenAlex data cautiously, being mindful of its potential limitations in cross-national analyses.
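One way to make "discontinuous coverage" concrete is to flag years in which a country's indexed publication count jumps or drops sharply relative to the previous year. The sketch below is only an illustration of that idea, not the paper's method: the function name, the 50% threshold, and the toy counts are all assumptions made up for the example.

```python
def flag_discontinuities(counts_by_year, threshold=0.5):
    """Return (year, relative_change) pairs where the count changed by more
    than `threshold` relative to the previous year in the series.

    `threshold` is an arbitrary illustrative cutoff, not a value from the paper.
    """
    flagged = []
    years = sorted(counts_by_year)
    for prev_year, year in zip(years, years[1:]):
        before = counts_by_year[prev_year]
        after = counts_by_year[year]
        if before == 0:
            # Relative change is undefined when the previous count is zero.
            continue
        change = (after - before) / before
        if abs(change) > threshold:
            flagged.append((year, round(change, 2)))
    return flagged


# Hypothetical toy counts, NOT real OpenAlex figures.
toy_counts = {2018: 100, 2019: 110, 2020: 40, 2021: 120}
print(flag_discontinuities(toy_counts))
```

With real data, the same check could be run per country on annual record counts, so that a sharp break in one country's series stands out against smoother series elsewhere.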
Problem

Research questions and friction points this paper is trying to address.

Assessing OpenAlex coverage gaps for Chinese publications
Comparing OpenAlex data quality across non-English-speaking countries
Identifying regional biases in OpenAlex research output coverage
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluates OpenAlex, a freely available database with broad open coverage
Compares China with other non-English-speaking countries
Assesses the completeness and continuity of OpenAlex's coverage of China's papers