AERO: An autonomous platform for continuous research

📅 2025-05-23
📈 Citations: 1
Influential: 0

🤖 AI Summary
The COVID-19 pandemic revealed critical deficiencies in conventional public health data platforms, particularly in automation, continuity, and cross-sector collaboration. To address these gaps, we propose and implement an autonomous platform designed for continuous research, featuring an end-to-end automated closed-loop architecture that integrates dynamic data governance and multi-stakeholder collaborative governance. The platform leverages Globus for secure, trusted data transfer and identity management, and GitHub for workflow versioning and CI/CD-driven automated execution. It supports fully automated acquisition, validation, transformation, analysis, and policy-governed sharing of surveillance data. The system is deployed in two real-world public health surveillance applications, and its scalability is evaluated by benchmarking a synthetic application. All system designs, source code, and experimental resources are openly released to support reproducibility.

📝 Abstract
The COVID-19 pandemic highlighted the need for new data infrastructure, as epidemiologists and public health workers raced to harness rapidly evolving data, analytics, and infrastructure in support of cross-sector investigations. To meet this need, we developed AERO, an automated research and data sharing platform for continuous, distributed, and multi-disciplinary collaboration. In this paper, we describe the AERO design and how it supports the automatic ingestion, validation, and transformation of monitored data into a form suitable for analysis; the automated execution of analyses on this data; and the sharing of data among different entities. We also describe how our AERO implementation leverages capabilities provided by the Globus platform and GitHub for automation, distributed execution, data sharing, and authentication. We present results obtained with an instance of AERO running two public health surveillance applications and demonstrate benchmarking results with a synthetic application, all of which are publicly available for testing.
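
The idea of acting on "monitored data" can be illustrated with a minimal sketch in which downstream steps run only when a source's content has actually changed. This is not AERO's code: the function names, the in-memory `seen` store, and the use of SHA-256 digests for change detection are all assumptions made for illustration.

```python
import hashlib


def content_digest(payload: bytes) -> str:
    """Return a SHA-256 hex digest that identifies this exact payload."""
    return hashlib.sha256(payload).hexdigest()


def has_changed(source_id: str, payload: bytes, seen: dict[str, str]) -> bool:
    """Report whether payload differs from the last one recorded for source_id.

    `seen` maps a (hypothetical) source identifier to the digest of the most
    recently observed payload, and is updated in place when a change is found.
    """
    digest = content_digest(payload)
    if seen.get(source_id) == digest:
        return False  # same bytes as last poll: nothing to re-run
    seen[source_id] = digest
    return True  # new or updated data: downstream steps should run
```

A polling loop or scheduled job could call `has_changed` after each fetch and skip validation and analysis when it returns `False`.
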
Problem

Research questions and friction points this paper is trying to address.

Develops AERO platform for continuous pandemic data collaboration
Automates data ingestion, validation, and analysis for epidemiology
Enables secure multi-entity data sharing using Globus and GitHub
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated data ingestion, validation, and transformation
Automated execution of analyses on monitored data
Distributed data sharing using Globus and GitHub
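
The ingestion, validation, transformation, and analysis stages listed above can be sketched as a chain of small functions. This is a toy illustration under stated assumptions, not the AERO implementation: the `date`/`cases` CSV schema, the function names, and the mean-count "analysis" are all hypothetical.

```python
import csv
import io
from statistics import mean

REQUIRED_COLUMNS = {"date", "cases"}  # hypothetical schema, for illustration only


def ingest(raw_text: str) -> list[dict]:
    """Parse raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def validate(rows: list[dict]) -> None:
    """Raise ValueError if rows are empty, lack columns, or hold bad counts."""
    if not rows:
        raise ValueError("no rows ingested")
    missing = REQUIRED_COLUMNS - set(rows[0])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    for row in rows:
        # Short rows yield None for absent fields, so fall back to "".
        if not (row["cases"] or "").isdecimal():
            raise ValueError(f"invalid case count: {row['cases']!r}")


def transform(rows: list[dict]) -> list[tuple[str, int]]:
    """Convert validated rows into (date, count) pairs sorted by date."""
    return sorted((row["date"], int(row["cases"])) for row in rows)


def analyze(series: list[tuple[str, int]]) -> float:
    """Toy analysis: mean daily case count."""
    return mean(count for _, count in series)


def run_pipeline(raw_text: str) -> float:
    """Run ingest -> validate -> transform -> analyze on one raw payload."""
    rows = ingest(raw_text)
    validate(rows)
    return analyze(transform(rows))
```

Keeping validation as a separate step that raises on bad input means a malformed update stops before any analysis runs, which is the property an automated pipeline of this kind depends on.
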
Valérie Hayot-Sasson
University of Chicago, Chicago, Illinois, USA. Argonne National Laboratory, Lemont, Illinois, USA.
Abby Stevens
Argonne National Laboratory, Lemont, Illinois, USA.
Nicholson T. Collier
Argonne National Laboratory, Lemont, Illinois, USA.
Sudershan Sridhar
Globus, Chicago, Illinois, USA.
Kyle Conroy
University of Chicago, Chicago, Illinois, USA. Argonne National Laboratory, Lemont, Illinois, USA.
J. G. Pauloski
University of Chicago, Chicago, Illinois, USA. Argonne National Laboratory, Lemont, Illinois, USA.
Y. Babuji
University of Chicago, Chicago, Illinois, USA.
Maxime Gonthier
University of Chicago, Chicago, Illinois, USA. Argonne National Laboratory, Lemont, Illinois, USA.
Nathaniel C. Hudson
University of Chicago, Chicago, Illinois, USA. Argonne National Laboratory, Lemont, Illinois, USA.
Dante D. Sánchez-Gallegos
Universidad Carlos III de Madrid
Distributed systems, cloud computing
Ian Foster
University of Chicago, Chicago, Illinois, USA. Argonne National Laboratory, Lemont, Illinois, USA.
J. Ozik
University of Chicago, Chicago, Illinois, USA. Argonne National Laboratory, Lemont, Illinois, USA.
Kyle Chard
University of Chicago and Argonne National Laboratory
computer science, distributed systems, high performance computing, scientific computing