🤖 AI Summary
The COVID-19 pandemic revealed critical challenges in public health, including delayed integration of multi-source data, lack of semantic interoperability, and sluggish emergency response. To address these, we developed the first FAIR-compliant, cloud-native data hub specifically designed for pandemic response. Our approach combines three elements: ontology-driven data mapping (compatible with standards such as OMOP and LOINC), automated metadata governance, and a cross-system semantic interoperability architecture. Together they unify, standardize, and de-identify heterogeneous data from clinical records, diagnostic testing, wearable sensors, symptom reporting, and social determinants of health. The platform has supported dozens of NIH RADx initiatives, substantially reducing data acquisition and analysis timelines. It has been formally designated by the NIH as a national core infrastructure for reusable COVID-19 data sharing and demonstrates extensibility to other disease domains.
📝 Abstract
The COVID-19 pandemic highlighted the urgent need for robust systems to enable rapid data collection, integration, and analysis for public health responses. Existing approaches often relied on disparate, non-interoperable systems, creating bottlenecks in comprehensive analyses and timely decision-making. To address these challenges, the U.S. National Institutes of Health (NIH) launched the Rapid Acceleration of Diagnostics (RADx) initiative in 2020, with the RADx Data Hub, a centralized repository for de-identified and curated COVID-19 data, as its cornerstone. The RADx Data Hub hosts diverse study data, including clinical data, testing results, smart sensor outputs, self-reported symptoms, and information on social determinants of health. Built on cloud infrastructure, the RADx Data Hub integrates metadata standards, interoperable formats, and ontology-based tools to adhere to the FAIR (Findable, Accessible, Interoperable, Reusable) principles for data sharing. Initially developed for COVID-19 research, its architecture and processes are adaptable to other scientific disciplines. This paper provides an overview of the data hosted by the RADx Data Hub and describes the platform's capabilities and architecture.
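As a rough illustration of what ontology-based harmonization and de-identification can look like in practice, the following minimal Python sketch maps study-specific variable names onto shared terms, drops direct identifiers, and attaches a small metadata record. It is not the RADx Data Hub's implementation: the `TERM_MAP` entries, the `EX:` term identifiers, the `DIRECT_IDENTIFIERS` set, and the `DatasetMetadata` fields are all hypothetical placeholders chosen for this example.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from study-specific variable names to shared terms.
# The "EX:" identifiers are placeholders, not real ontology codes.
TERM_MAP = {
    "pcr_result": "EX:0001",
    "covid_test_outcome": "EX:0001",
    "age": "EX:0002",
    "age_years": "EX:0002",
}

# Fields treated as direct identifiers in this sketch (an assumption).
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def harmonize(record: dict) -> dict:
    """Drop direct identifiers and rename mapped fields to shared terms."""
    out = {}
    for key, value in record.items():
        if key in DIRECT_IDENTIFIERS:
            continue  # de-identification step: omit the field entirely
        out[TERM_MAP.get(key, key)] = value
    return out


@dataclass
class DatasetMetadata:
    """Minimal descriptive record so a dataset can be found and reused."""
    identifier: str
    title: str
    access: str
    terms_used: list = field(default_factory=list)


def describe(identifier: str, title: str, records: list) -> DatasetMetadata:
    """Summarize which shared terms appear in the harmonized records."""
    terms = sorted({k for r in records for k in r if k.startswith("EX:")})
    return DatasetMetadata(identifier, title, "controlled", terms)


# Two studies that name the same concepts differently.
study_a = [{"name": "A. Person", "pcr_result": "positive", "age": 41}]
study_b = [{"email": "x@example.org", "covid_test_outcome": "negative", "age_years": 29}]

combined = [harmonize(r) for r in study_a + study_b]
meta = describe("example-dataset-1", "Example harmonized dataset", combined)
print(combined)
print(meta)
```

After harmonization, records from both studies share the same keys, so they can be queried together, and the metadata record lists the shared terms the dataset uses. A production system would rely on curated ontologies and a formal de-identification procedure rather than hand-written dictionaries.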