🤖 AI Summary
Problem: Infectious and immune-mediated disease (IID) data are fragmented across disparate sources, lack standardized metadata schemas, and suffer from poor discoverability and reusability. Method: We developed the first unified metadata search platform specifically for the IID domain, harmonizing over 4 million dataset-level metadata records from 400+ specialized and general-purpose databases. Through format normalization, semantic integration, and construction of a domain-specific ontology, the platform enables natural-language search, predefined queries, faceted browsing, and programmatic API access. Contribution/Results: This work represents the first systematic, cross-source metadata aggregation and interoperability framework for IID data, substantially enhancing Findability, Accessibility, Interoperability, and Reusability (FAIRness). The platform actively supports NIH/NIAID-funded projects and global researchers in hypothesis-driven analysis, cross-cohort comparison, and secondary analysis of public datasets, thereby increasing the scientific return on investment in biomedical data infrastructure.
📝 Abstract
The NIAID Data Ecosystem Discovery Portal (https://data.niaid.nih.gov) provides a unified search interface for over 4 million datasets relevant to infectious and immune-mediated disease (IID) research. By integrating metadata from domain-specific and generalist repositories, the Portal enables researchers to identify and access datasets through user-friendly filters or advanced queries, without requiring technical expertise. The Portal supports discovery of a wide range of resources, including epidemiological, clinical, and multi-omic datasets, and is designed to accommodate both exploratory browsing and precise searches. Filters, prebuilt queries, and dataset collections simplify the discovery process, while documentation and an API offer programmatic access to the harmonized metadata. By lowering barriers to accessing important biomedical datasets, the NIAID Data Ecosystem Discovery Portal serves as an entry point for researchers working to understand, diagnose, or treat IID.
IMPORTANCE
Valuable datasets are often overlooked because they are difficult to locate. The NIAID Data Ecosystem Discovery Portal addresses this problem by providing a centralized, searchable interface that lets users with varying levels of technical expertise find and reuse data. By standardizing key metadata fields and harmonizing heterogeneous formats, the Portal improves data findability, accessibility, and reusability. This resource supports hypothesis generation, comparative analysis, and secondary use of public data by the IID research community, including NIAID-funded researchers. By linking each record back to its source repository, the Portal also supports data sharing and maximizes the impact of public investment in research data through secondary use.
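To make the idea of standardizing key metadata fields and harmonizing heterogeneous formats concrete, the following is a minimal Python sketch. It is not the Portal's actual pipeline: the source names (`repo_a`, `repo_b`), the field names, the common schema, and the accepted date formats are all hypothetical and chosen only for illustration.

```python
from datetime import datetime

# Hypothetical mapping from each source's field names to a shared schema.
# Neither the source names nor the field names come from the Portal.
FIELD_MAPS = {
    "repo_a": {"title": "name", "summary": "description",
               "released": "date", "link": "url"},
    "repo_b": {"dataset_name": "name", "abstract": "description",
               "pub_date": "date", "landing_page": "url"},
}

# Date formats assumed to occur in the (hypothetical) source records.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def normalize_date(value):
    """Return the date as an ISO 8601 string, or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def harmonize(record, source):
    """Map one source record onto the shared schema, keeping its provenance."""
    harmonized = {"source": source}
    for source_field, common_field in FIELD_MAPS[source].items():
        value = record.get(source_field)
        if value is None:
            continue  # field absent in this record; leave it out
        if isinstance(value, str):
            value = value.strip()
        if common_field == "date":
            value = normalize_date(value)
        harmonized[common_field] = value
    return harmonized


if __name__ == "__main__":
    a = {"title": " Influenza cohort ", "released": "03/02/2021"}
    b = {"dataset_name": "TB transcriptomes", "pub_date": "2020-11-30"}
    for rec, src in ((a, "repo_a"), (b, "repo_b")):
        print(harmonize(rec, src))
```

Once records from different repositories share field names and a single date format, one filter or query can run across all of them, which is the property a unified search interface depends on.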