Combining digital data streams and epidemic networks for real time outbreak detection

📅 2025-11-10
📈 Citations: 0 (influential: 0)
🤖 AI Summary
High noise in epidemiological time series and the limitations of conventional surveillance hinder real-time outbreak detection. The paper proposes LRTrend, an interpretable machine learning framework that aggregates heterogeneous digital health and behavioral data streams within each region and learns disease-specific epidemic networks to share outbreak information across regions. The learned networks reveal epidemic clusters and connections that commonly used human mobility networks do not explain well. Applied to two years of COVID-19 data from 305 U.S. hospital referral regions, LRTrend frequently detects regional Delta and Omicron waves within two weeks of onset, when case counts are still a small fraction of the wave's eventual peak.

📝 Abstract
Responding to disease outbreaks requires close surveillance of their trajectories, but outbreak detection is hindered by the high noise in epidemic time series. Aggregating information across data sources has shown great denoising ability in other fields, but remains underexplored in epidemiology. Here, we present LRTrend, an interpretable machine learning framework to identify outbreaks in real time. LRTrend effectively aggregates diverse health and behavioral data streams within one region and learns disease-specific epidemic networks to aggregate information across regions. We reveal diverse epidemic clusters and connections across the United States that are not well explained by commonly used human mobility networks and may be informative for future public health coordination. We apply LRTrend to 2 years of COVID-19 data in 305 hospital referral regions and frequently detect regional Delta and Omicron waves within 2 weeks of the outbreak's start, when case counts are a small fraction of the wave's resulting peak.
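
The denoising claim above can be illustrated with a minimal sketch. This is not LRTrend itself: the abstract does not specify the aggregation rule, so the example assumes a plain unweighted average over streams that all observe the same latent signal with independent Gaussian noise. The signal shape, the stream count, the noise level, and every name in the code are hypothetical.

```python
import math
import random


def rmse(estimate, truth):
    """Root-mean-square error between an estimate and the true signal."""
    squared = sum((e - t) ** 2 for e, t in zip(estimate, truth))
    return math.sqrt(squared / len(truth))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_days, n_streams, noise_sd = 120, 8, 1.0  # hypothetical settings

# Hypothetical latent outbreak signal: flat at first, then a logistic rise.
truth = [5.0 / (1.0 + math.exp(-(t - 80) / 6.0)) for t in range(n_days)]

# Each simulated data stream sees the same signal plus independent noise.
streams = [
    [x + rng.gauss(0.0, noise_sd) for x in truth]
    for _ in range(n_streams)
]

# Assumed aggregation rule: unweighted mean across streams for each day.
aggregate = [sum(day_values) / n_streams for day_values in zip(*streams)]

individual_rmse = [rmse(s, truth) for s in streams]
mean_individual_rmse = sum(individual_rmse) / n_streams
aggregate_rmse = rmse(aggregate, truth)

print(f"mean RMSE of single streams: {mean_individual_rmse:.3f}")
print(f"RMSE of aggregated stream:   {aggregate_rmse:.3f}")
```

Under these assumptions the aggregated error cannot exceed the average single-stream error, and with independent noise it shrinks roughly with the square root of the number of streams. Real health and behavioral streams differ in scale, lag, and bias, so a practical aggregator would need to account for those differences rather than average raw values.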

Problem

Research questions and friction points this paper is trying to address.

Detecting disease outbreaks in real time from noisy epidemic time series data
Aggregating diverse health data streams within regions and across epidemic networks
Identifying regional epidemic clusters and connections for public health coordination

Innovation

Methods, ideas, or system contributions that make the work stand out.

Combining digital data streams with epidemic networks
Aggregating health and behavioral data within regions
Learning disease-specific epidemic networks across regions
Ruiqi Lyu
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
Alistair Turcan
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
Bryan Wilder
Assistant Professor of Machine Learning, Carnegie Mellon University
Artificial intelligence, optimization, machine learning, social networks