🤖 AI Summary
Traditional science mapping methods rely on document clustering, which is inherently biased toward clustering some topics over others. This paper explores document networks built from diverse data sources (patents, policy documents, Facebook and Twitter activity, and author metadata) as a tool to control that bias and tailor science maps to different needs. It evaluates the clustering effectiveness of several topic categories over two traditional and six non-traditional data sources and identifies source-specific biases (e.g., Facebook users favoring health topics; document authors strongly favoring geographical entities). The topics favored by the six non-traditional sources are health, biotechnology, government and social issues, food, nursing, and geographical entities, which suggests that the choice of data source can be used to steer the topical emphasis of a science map.
📝 Abstract
Traditional science maps visualize topics by clustering documents, but they are inherently biased toward clustering certain topics over others. If the favored topics could be chosen, science maps could be tailored to different needs. In this paper, we explore the use of document networks from diverse data sources as a tool to control the topic clustering bias of a science map. We do so by evaluating the clustering effectiveness of several topic categories over two traditional and six non-traditional data sources. We found that each non-traditional data source favors a different topic: health for Facebook users, biotechnology for patent families, government and social issues for policy documents, food for Twitter conversations, nursing for Twitter users, and geographical entities for document authors (the bias in this last source was particularly strong). Our results show that diverse data sources can be used to control topic bias, which opens up the possibility of creating science maps tailored to different needs.