A statistical framework for comparing epidemic forests

📅 2025-11-25
🤖 AI Summary
Inferring transmission chains during an epidemic often yields multiple candidate transmission trees, collectively termed an "epidemic forest", yet no statistical test has been available to assess whether differences between such forests are significant. This paper introduces a framework for comparing epidemic forests that combines a chi-square test with permutational multivariate analysis of variance (PERMANOVA). Both methods are evaluated on simulated epidemic forests generated under different offspring distributions and are implemented in the R package *mixtree*. Both achieve perfect specificity when forests contain 100 or more trees, while PERMANOVA is consistently more sensitive than the chi-square test across all epidemic and forest sizes. The framework enables quantitative, reproducible comparison of transmission structures inferred under distinct models, data assumptions, or intervention scenarios, filling a methodological gap in outbreak reconstruction and model assessment.

📝 Abstract
Inferring who infected whom in an outbreak is essential for characterising transmission dynamics and guiding public health interventions. However, this task is challenging due to limited surveillance data and the complexity of immunological and social interactions. Instead of a single definitive transmission tree, epidemiologists often consider multiple plausible trees forming *epidemic forests*. Various inference methods and assumptions can yield different epidemic forests, yet no formal test exists to assess whether these differences are statistically significant. We propose such a framework using a chi-square test and permutational multivariate analysis of variance (PERMANOVA). We assessed each method's ability to distinguish simulated epidemic forests generated under different offspring distributions. While both methods achieved perfect specificity for forests with 100+ trees, PERMANOVA consistently outperformed the chi-square test in sensitivity across all epidemic and forest sizes. Implemented in the R package *mixtree*, we provide the first statistical framework to robustly compare epidemic forests.
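To make the PERMANOVA idea concrete, the following is a minimal sketch in plain Python, not the *mixtree* implementation. Several pieces are assumptions made for illustration only: each tree is stored as a list whose entry `i` is the infector of case `i` (`None` for the index case), the distance between two trees is the fraction of cases whose infector differs, and the toy simulator `simulate_forest` with its `p_index` parameter is hypothetical. The paper may use a different tree representation and distance.

```python
import random

def simulate_forest(n_trees, n_cases, p_index, rng):
    """Toy simulator (assumption): case i > 0 is infected by case 0 with
    probability p_index, otherwise by a uniformly chosen earlier case."""
    forest = []
    for _ in range(n_trees):
        tree = [None]  # the index case has no infector
        for i in range(1, n_cases):
            tree.append(0 if rng.random() < p_index else rng.randrange(i))
        forest.append(tree)
    return forest

def tree_distance(a, b):
    """Assumed distance: fraction of cases whose infector differs."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def pseudo_f(d2, labels):
    """PERMANOVA pseudo-F statistic from a matrix of squared distances."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(d2[i][j] for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i in range(n) if labels[i] == g]
        ss_within += sum(
            d2[i][j] for k, i in enumerate(idx) for j in idx[k + 1:]
        ) / len(idx)
    ss_between = ss_total - ss_within
    return (ss_between / (len(groups) - 1)) / (ss_within / (n - len(groups)))

def permanova(trees, labels, n_perm=199, seed=1):
    """Permutation p-value: shuffle forest labels and recompute pseudo-F."""
    rng = random.Random(seed)
    n = len(trees)
    d2 = [[tree_distance(trees[i], trees[j]) ** 2 for j in range(n)]
          for i in range(n)]
    f_obs = pseudo_f(d2, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(d2, shuffled) >= f_obs:
            exceed += 1
    return f_obs, (exceed + 1) / (n_perm + 1)

# Two forests of 30 trees each, generated under different transmission patterns.
rng = random.Random(42)
forest_a = simulate_forest(30, 10, 0.0, rng)  # infectors spread over earlier cases
forest_b = simulate_forest(30, 10, 0.8, rng)  # index case acts as a superspreader
labels = ["A"] * 30 + ["B"] * 30
f_obs, p_value = permanova(forest_a + forest_b, labels)
print(f"pseudo-F = {f_obs:.2f}, p = {p_value:.3f}")
```

The p-value is the share of label permutations whose pseudo-F is at least as large as the observed one, so its smallest attainable value is 1 / (n_perm + 1). Because pseudo-F is built from within-group and between-group distances, the test also responds to differences in how dispersed the trees are within each forest.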
Problem

Research questions and friction points this paper is trying to address.

Inferring transmission trees during disease outbreaks with limited surveillance data
Assessing whether differences between epidemic forests produced by different inference methods are statistically significant
Developing a framework to compare epidemic forests using chi-square and PERMANOVA tests
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposed a chi-square test for comparing epidemic forests
Applied PERMANOVA to test whether differences between forests are statistically significant
Implemented the framework in the R package mixtree
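As a rough illustration of how a chi-square comparison of two forests could work, the sketch below counts how often each infector-infectee pair appears in each forest and computes the Pearson chi-square statistic on the resulting 2 x K contingency table. This is an assumption about what the test tabulates, not a description of the *mixtree* code; the tree encoding (entry `i` holds the infector of case `i`, `None` for the index case) and the function names are hypothetical.

```python
from collections import Counter

def pair_counts(forest):
    """Count occurrences of each (infector, infectee) pair across a forest."""
    counts = Counter()
    for tree in forest:
        for infectee, infector in enumerate(tree):
            if infector is not None:  # skip the index case
                counts[(infector, infectee)] += 1
    return counts

def chi_square_statistic(forest_a, forest_b):
    """Pearson chi-square statistic and degrees of freedom for the
    2 x K table of pair counts (rows: forests, columns: pairs)."""
    counts = [pair_counts(forest_a), pair_counts(forest_b)]
    pairs = sorted(set(counts[0]) | set(counts[1]))
    row_totals = [sum(c.values()) for c in counts]
    grand_total = sum(row_totals)
    stat = 0.0
    for pair in pairs:
        col_total = counts[0][pair] + counts[1][pair]
        for row in range(2):
            expected = row_totals[row] * col_total / grand_total
            stat += (counts[row][pair] - expected) ** 2 / expected
    dof = len(pairs) - 1  # (rows - 1) * (columns - 1) with two rows
    return stat, dof

# Tiny example with three cases: the forests disagree on who infected case 2.
forest_a = [[None, 0, 0], [None, 0, 0]]
forest_b = [[None, 0, 1], [None, 0, 1]]
stat, dof = chi_square_statistic(forest_a, forest_b)
print(f"chi-square = {stat:.2f}, dof = {dof}")  # prints: chi-square = 4.00, dof = 2
```

A p-value would then come from the chi-square distribution with `dof` degrees of freedom, for example via `scipy.stats.chi2.sf(stat, dof)`. With many possible pairs and few trees, expected counts become small and the asymptotic approximation weakens.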
Cyril Geismar
Bloomberg School of Public Health, Johns Hopkins University, Baltimore, United States
Peter J. White
MRC Centre for Global Infectious Disease Analysis, Imperial College School of Public Health, London, United Kingdom
Anne Cori
Imperial College
infectious disease epidemiology
Thibaut Jombart
MRC Centre for Global Infectious Disease Analysis, Imperial College School of Public Health, London, United Kingdom