A decentralized future for the open-science databases

📅 2025-09-23
🤖 AI Summary
Centralized biological data repositories face single-point-of-failure risks (cyberattacks, natural disasters, and governance or funding disruptions) that jeopardize data availability, integrity, and research continuity. To address this, we propose a hybrid scientific data infrastructure integrating federated architecture with decentralized technologies. Our approach employs distributed storage, cross-domain federated governance, and coordinated on-chain/off-chain mechanisms for data integrity verification. The resulting framework enhances resilience and governance equity while adhering to FAIR principles. It significantly reduces dependence on central authorities, promotes fairer global data sovereignty distribution, and improves long-term sustainability. Empirical evaluation demonstrates robust fault tolerance, scalable interoperability across heterogeneous domains, and verifiable provenance tracking. This infrastructure provides a resilient foundation for open science, enabling trustworthy, persistent, and collaboratively governed biological data stewardship.
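The on-chain/off-chain integrity idea mentioned in the summary can be illustrated with a minimal sketch. It is not the paper's implementation: the `Ledger` class, `fetch_verified` function, and dictionary-based replicas are hypothetical stand-ins, assuming that bulk data lives off-chain on several replicas while only SHA-256 digests are anchored in an append-only, hash-chained log.

```python
import hashlib
import json


def digest(data: bytes) -> str:
    """SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


class Ledger:
    """Toy append-only log standing in for the on-chain component (assumption)."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    @staticmethod
    def _hash_body(record_id, data_digest, prev):
        body = {"record_id": record_id, "data_digest": data_digest, "prev": prev}
        return digest(json.dumps(body, sort_keys=True).encode())

    def anchor(self, record_id, data_digest):
        """Append a digest, chaining it to the previous entry's hash."""
        prev = self.entries[-1]["entry_hash"] if self.entries else self.GENESIS
        self.entries.append({
            "record_id": record_id,
            "data_digest": data_digest,
            "prev": prev,
            "entry_hash": self._hash_body(record_id, data_digest, prev),
        })

    def is_consistent(self):
        """Recompute the hash chain; any edited entry breaks it."""
        prev = self.GENESIS
        for e in self.entries:
            expected = self._hash_body(e["record_id"], e["data_digest"], e["prev"])
            if e["prev"] != prev or e["entry_hash"] != expected:
                return False
            prev = e["entry_hash"]
        return True

    def lookup(self, record_id):
        """Most recently anchored digest for a record, or None."""
        for e in reversed(self.entries):
            if e["record_id"] == record_id:
                return e["data_digest"]
        return None


def fetch_verified(record_id, replicas, ledger):
    """Return the first off-chain copy whose digest matches the anchored one."""
    expected = ledger.lookup(record_id)
    if expected is None:
        raise KeyError(f"no anchored digest for {record_id!r}")
    for store in replicas:  # each replica is modelled as a plain dict
        data = store.get(record_id)
        if data is not None and digest(data) == expected:
            return data
    raise ValueError(f"no intact replica holds {record_id!r}")
```

In this sketch a reader can obtain a record from any replica and still detect a corrupted or missing copy, because the check depends only on the anchored digest rather than on trusting a particular host.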

📝 Abstract
Continuous and reliable access to curated biological data repositories is indispensable for accelerating rigorous scientific inquiry and fostering reproducible research. Centralized repositories, though widely used, are vulnerable to single points of failure arising from cyberattacks, technical faults, natural disasters, or funding and political uncertainties. This can lead to widespread data unavailability, data loss, integrity compromises, and substantial delays in critical research, ultimately impeding scientific progress. Centralizing essential scientific resources in a single geopolitical or institutional hub is inherently dangerous, as any disruption can paralyze diverse ongoing research. The rapid acceleration of data generation, combined with an increasingly volatile global landscape, necessitates a critical re-evaluation of the sustainability of centralized models. Implementing federated and decentralized architectures presents a compelling and future-oriented pathway to substantially strengthen the resilience of scientific data infrastructures, thereby mitigating vulnerabilities and ensuring the long-term integrity of data. Here, we examine the structural limitations of centralized repositories, evaluate federated and decentralized models, and propose a hybrid framework for resilient, FAIR, and sustainable scientific data stewardship. Such an approach offers a significant reduction in exposure to governance instability, infrastructural fragility, and funding volatility, and also fosters fairness and global accessibility. The future of open science depends on integrating these complementary approaches to establish a globally distributed, economically sustainable, and institutionally robust infrastructure that safeguards scientific data as a public good, further ensuring continued accessibility, interoperability, and preservation for generations to come.
Problem

Research questions and friction points this paper is trying to address.

Centralized scientific databases face single-point-of-failure risks
Current repositories are vulnerable to cyberattacks, funding issues, and disasters
Centralized models threaten long-term data integrity and global research access
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes a hybrid federated-decentralized architecture
Enhances resilience against centralized vulnerabilities
Ensures FAIR, sustainable global data stewardship
Authors

Gaurav Sharma
Department of Biotechnology, Indian Institute of Technology Hyderabad, Sangareddy, Telangana, India 502284
Viorel Munteanu
Department of Computers, Informatics and Microelectronics, Technical University of Moldova, Chisinau, 2045, Moldova; Department of Biological and Morphofunctional Sciences, College of Medicine and Biological Sciences, University of Suceava, Suceava, 720229, Romania
Nika Mansouri Ghiasi
ETH Zürich
Computer Architecture, Bioinformatics
Jineta Banerjee
Sage Bionetworks, Washington, USA
Susheel Varma
Sage Bionetworks, Washington, USA
Luca Foschini
Sage Bionetworks, Washington, USA
Kyle Ellrott
Oregon Health and Science University, Portland, Oregon
O. Mutlu
Department of Information Technology and Electrical Engineering, ETH Zurich, Switzerland
Dumitru Ciorbă
Department of Computers, Informatics and Microelectronics, Technical University of Moldova
R. Ophoff
V. Bostan
Christopher E. Mason
Jason H. Moore
Chair, Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA
Artificial Intelligence, Machine Learning, Biomedical Informatics, Precision Medicine, Translational Bioinformatics
Despoina Sousoni
Arunkumar Krishnan
Mihai Dimian
Gustavo Stolovitzky
Director, Biomed Data Sciences Hub, NYU Langone Health; Prof. Pathology Dep, NYU School of Medicine
Systems Biology, Quantitative Biology, Genetics, Genomics, Data Science
F. G. Liberante
Taras K. Oleksyk
Biological Sciences, Oakland University
Genomics, Conservation Genetics, Evolution
S. Mangul