AERO: An autonomous platform for continuous research

📅 2025-05-23
📈 Citations: 1
Influential: 0
🤖 AI Summary
The COVID-19 pandemic exposed gaps in conventional public health data platforms, particularly in automation, continuity, and cross-sector collaboration. To address these gaps, the authors design and implement AERO, an automated research and data sharing platform for continuous, distributed, multi-disciplinary collaboration. AERO automates the ingestion, validation, and transformation of monitored data, the execution of analyses on that data, and the sharing of data among different entities. The implementation builds on the Globus platform and GitHub for automation, distributed execution, data sharing, and authentication. An AERO instance running two public health surveillance applications demonstrates the system in operation, and a synthetic application is used for benchmarking. The applications and benchmark are publicly available for testing.

📝 Abstract
The COVID-19 pandemic highlighted the need for new data infrastructure, as epidemiologists and public health workers raced to harness rapidly evolving data, analytics, and infrastructure in support of cross-sector investigations. To meet this need, we developed AERO, an automated research and data sharing platform for continuous, distributed, and multi-disciplinary collaboration. In this paper, we describe the AERO design and how it supports the automatic ingestion, validation, and transformation of monitored data into a form suitable for analysis; the automated execution of analyses on this data; and the sharing of data among different entities. We also describe how our AERO implementation leverages capabilities provided by the Globus platform and GitHub for automation, distributed execution, data sharing, and authentication. We present results obtained with an instance of AERO running two public health surveillance applications and demonstrate benchmarking results with a synthetic application, all of which are publicly available for testing.
Problem

Research questions and friction points this paper is trying to address.

Develops AERO platform for continuous pandemic data collaboration
Automates data ingestion, validation, and analysis for epidemiology
Enables secure multi-entity data sharing using Globus and GitHub
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated data ingestion, validation, and transformation
Automated execution of analyses on monitored data
Distributed data sharing using Globus and GitHub
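The first two items above can be pictured as a small polling pipeline. The following is a minimal sketch only, not AERO's actual implementation or API: the function names, the inline CSV data, and the checksum-based change detection are all assumptions made for illustration, and the real platform delegates these steps to Globus and GitHub services rather than to local functions.

```python
import csv
import hashlib
import io
from statistics import mean
from typing import Optional

# Hypothetical stand-in for a monitored data source; a real deployment
# would fetch this from a remote location on a schedule.
RAW_CSV = "date,cases\n2020-03-01,10\n2020-03-02,14\n2020-03-03,bad\n2020-03-04,18\n"


def checksum(text: str) -> str:
    """Fingerprint the raw data so an unchanged source can be skipped."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def validate(rows):
    """Keep only rows whose case count is a non-negative integer."""
    return [r for r in rows if r["cases"].isdecimal()]


def transform(rows):
    """Convert validated rows into typed (date, cases) tuples."""
    return [(r["date"], int(r["cases"])) for r in rows]


def analyze(records):
    """Toy analysis: mean daily case count."""
    return mean(cases for _, cases in records)


def run_once(raw: str, last_seen: Optional[str]):
    """Run one polling step.

    Returns (checksum, result); result is None when the data is unchanged
    since the previous step, so no analysis is re-run.
    """
    digest = checksum(raw)
    if digest == last_seen:
        return digest, None
    rows = list(csv.DictReader(io.StringIO(raw)))
    return digest, analyze(transform(validate(rows)))
```

Calling `run_once(RAW_CSV, None)` runs the full ingest, validate, transform, and analyze chain; passing the returned checksum back in on the next call skips the analysis because the source has not changed. This mirrors the idea of re-running analyses only when monitored data is updated.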
Authors
Valérie Hayot-Sasson
University of Chicago, Chicago, Illinois, USA. Argonne National Laboratory, Lemont, Illinois, USA.
Abby Stevens
Argonne National Laboratory, Lemont, Illinois, USA.
Nicholson T. Collier
Argonne National Laboratory, Lemont, Illinois, USA.
Sudershan Sridhar
Globus, Chicago, Illinois, USA.
Kyle Conroy
University of Chicago, Chicago, Illinois, USA. Argonne National Laboratory, Lemont, Illinois, USA.
J. G. Pauloski
University of Chicago, Chicago, Illinois, USA. Argonne National Laboratory, Lemont, Illinois, USA.
Y. Babuji
University of Chicago, Chicago, Illinois, USA.
Maxime Gonthier
University of Chicago, Chicago, Illinois, USA. Argonne National Laboratory, Lemont, Illinois, USA.
Nathaniel C. Hudson
University of Chicago, Chicago, Illinois, USA. Argonne National Laboratory, Lemont, Illinois, USA.
Dante D. Sánchez-Gallegos
Universidad Carlos III de Madrid
Distributed systems, cloud computing
Ian Foster
University of Chicago, Chicago, Illinois, USA. Argonne National Laboratory, Lemont, Illinois, USA.
J. Ozik
University of Chicago, Chicago, Illinois, USA. Argonne National Laboratory, Lemont, Illinois, USA.
Kyle Chard
University of Chicago and Argonne National Laboratory
Computer science, distributed systems, high performance computing, scientific computing