🤖 AI Summary
The COVID-19 pandemic exposed deficiencies in conventional public health data platforms, particularly in automation, continuity, and cross-sector collaboration. To address these gaps, we design and implement AERO, an automated research and data sharing platform for continuous, distributed, and multi-disciplinary collaboration. The platform uses Globus for secure data transfer and identity management, and GitHub for workflow versioning and CI/CD-driven automated execution. It supports automated ingestion, validation, transformation, and analysis of surveillance data, as well as sharing of that data among different entities. An AERO instance running two public health surveillance applications demonstrates the system in operation, and a synthetic application is used for benchmarking. All of these are publicly available for testing.
📝 Abstract
The COVID-19 pandemic highlighted the need for new data infrastructure, as epidemiologists and public health workers raced to harness rapidly evolving data, analytics, and infrastructure in support of cross-sector investigations. To meet this need, we developed AERO, an automated research and data sharing platform for continuous, distributed, and multi-disciplinary collaboration. In this paper, we describe the AERO design and how it supports the automatic ingestion, validation, and transformation of monitored data into a form suitable for analysis; the automated execution of analyses on this data; and the sharing of data among different entities. We also describe how our AERO implementation leverages capabilities provided by the Globus platform and GitHub for automation, distributed execution, data sharing, and authentication. We present results obtained with an instance of AERO running two public health surveillance applications and report benchmarking results obtained with a synthetic application. The applications and benchmarks are publicly available for testing.