🤖 AI Summary
Inconsistent document-type definitions across academic databases hinder cross-database comparability of bibliometric indicators, limiting their utility in research evaluation. This study presents the first systematic comparison of document-type classification schemes across five major platforms—OpenAlex, Web of Science, Scopus, PubMed, and Semantic Scholar—employing large-scale metadata extraction, cross-database label mapping, quantitative consistency assessment, and expert validation. Results reveal substantial structural disagreement regarding the classification of “research articles,” with OpenAlex exhibiting broad coverage but coarse-grained typology, necessitating rule-based calibration for alignment. The study delineates OpenAlex’s applicability boundaries and optimization pathways for bibliometric use, establishing a methodological foundation and empirical evidence for standardizing bibliometric practices in open science contexts.
📝 Abstract
This study compares and analyses publication and document types across the following bibliographic databases: OpenAlex, Scopus, Web of Science, Semantic Scholar and PubMed. The results demonstrate that typologies can differ considerably between database providers. Moreover, the distinction between research and non-research texts, which is required to identify relevant documents for bibliometric analysis, can vary with the data source because the same publications are classified differently in the respective databases. Beyond the cross-database comparison, this study focuses primarily on the coverage and analysis of the publication and document types contained in OpenAlex, as OpenAlex is becoming increasingly important as a free alternative to established proprietary providers for bibliometric analyses at libraries and universities.