Advancing Scientific Knowledge Retrieval and Reuse with a Novel Digital Library for Machine-Readable Knowledge

📅 2025-07-13
🏛️ Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Contemporary digital libraries (e.g., ACM Digital Library, Semantic Scholar) follow a document-centric paradigm in which knowledge is expressed as narrative text and must be extracted manually or semi-automatically. This hinders machine-readable, fine-grained, and reproducible links among scientific claims, supporting data, and code. Method: This paper introduces "reborn" (born-reusable) articles and presents ORKG reborn, an emerging digital library designed for machine-supported knowledge retrieval and reuse. It publishes scientific statements as machine-readable expressions linked to their supporting data and code, so that knowledge can be retrieved by statistical method, software package, variable, or data matching specific constraints. Contribution/Results: Using several published articles from fields ranging from computer science to soil science, the authors demonstrate the system's practical viability and its retrieval potential in contrast to state-of-the-art digital libraries and document-centric scholarly communication.

📝 Abstract
Digital libraries for research, such as the ACM Digital Library or Semantic Scholar, do not enable the machine-supported, efficient reuse of scientific knowledge (e.g., in synthesis research). This is because these libraries are based on document-centric models with narrative text knowledge expressions that require manual or semi-automated knowledge extraction, structuring, and organization. We present ORKG reborn, an emerging digital library that supports finding, accessing, and reusing accurate, fine-grained, and reproducible machine-readable expressions of scientific knowledge that relate scientific statements and their supporting evidence in terms of data and code. The rich expressions of scientific knowledge are published as reborn (born-reusable) articles and provide novel possibilities for scientific knowledge retrieval, for instance by statistical methods, software packages, variables, or data matching specific constraints. We describe the proposed system and demonstrate its practical viability and potential for information retrieval in contrast to state-of-the-art digital libraries and document-centric scholarly communication using several published articles in research fields ranging from computer science to soil science. Our work underscores the enormous potential of scientific knowledge databases and a viable approach to their construction.
Problem

Research questions and friction points this paper is trying to address.

Existing digital libraries lack machine-readable knowledge reuse capabilities
Document-centric models require manual extraction of scientific knowledge
Current systems cannot efficiently retrieve fine-grained reproducible research components
Innovation

Methods, ideas, or system contributions that make the work stand out.

Digital library for machine-readable scientific knowledge
Fine-grained reproducible expressions of scientific statements
Born-reusable articles enabling statistical constraint-based retrieval
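
The constraint-based retrieval described above can be illustrated with a minimal sketch. It is not the ORKG reborn data model or API: the `Statement` record, its fields, the `find_statements` helper, and all sample entries are hypothetical and exist only to show how statements that carry their method, software, and variables as structured fields could be filtered without reading narrative text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Statement:
    """Hypothetical machine-readable scientific statement (illustration only)."""
    claim: str                       # the scientific statement itself
    method: str                      # statistical method backing the claim
    software: str                    # software package used for the analysis
    variables: Tuple[str, ...] = ()  # variables involved in the analysis


def find_statements(
    statements: List[Statement],
    method: Optional[str] = None,
    software: Optional[str] = None,
    variable: Optional[str] = None,
) -> List[Statement]:
    """Return the statements that satisfy every constraint that is not None."""
    results = []
    for s in statements:
        if method is not None and s.method != method:
            continue
        if software is not None and s.software != software:
            continue
        if variable is not None and variable not in s.variables:
            continue
        results.append(s)
    return results


# Made-up sample records; none of them come from the paper.
LIBRARY = [
    Statement("Illustrative claim A", "linear mixed model", "lme4",
              ("soil organic carbon", "tillage")),
    Statement("Illustrative claim B", "t-test", "scipy",
              ("response time",)),
    Statement("Illustrative claim C", "linear mixed model", "statsmodels",
              ("response time", "interface")),
]

hits = find_statements(LIBRARY, method="linear mixed model",
                       variable="response time")
print([s.claim for s in hits])  # prints ['Illustrative claim C']
```

In a document-centric library the same question would require reading each article to find out which method and variables it used. Here the constraints are checked directly against structured fields.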
Hadi Ghaemi
TIB - Leibniz Information Centre for Science and Technology
Lauren Snyder
TIB - Leibniz Information Centre for Science and Technology
Markus Stocker
TIB - Leibniz Information Centre for Science and Technology and Leibniz University Hannover
Knowledge Infrastructures, Digital Scholarship, Neurosymbolic AI, Environmental Informatics