🤖 AI Summary
Reconstructing transmission chains during an outbreak often yields multiple candidate transmission trees, collectively termed an “epidemic forest”, yet no formal statistical test exists to assess whether differences among such forests are significant. This paper proposes the first statistical framework for comparing epidemic forests, based on a chi-square test and permutational multivariate analysis of variance (PERMANOVA). Both methods are evaluated on simulated epidemic forests generated under different offspring distributions, and the framework is implemented in the R package *mixtree*. Both tests achieve perfect specificity for forests of 100 or more trees, while PERMANOVA is consistently more sensitive than the chi-square test across all epidemic and forest sizes. The framework enables quantitative, reproducible comparison of transmission trees inferred under different methods or assumptions, filling a methodological gap in the evaluation of outbreak reconstructions.
📝 Abstract
Inferring who infected whom in an outbreak is essential for characterising transmission dynamics and guiding public health interventions. However, this task is challenging due to limited surveillance data and the complexity of immunological and social interactions. Instead of a single definitive transmission tree, epidemiologists often consider multiple plausible trees, which together form an *epidemic forest*. Different inference methods and assumptions can yield different epidemic forests, yet no formal test exists to assess whether these differences are statistically significant. We propose such a framework using a chi-square test and permutational multivariate analysis of variance (PERMANOVA). We assessed each method's ability to distinguish simulated epidemic forests generated under different offspring distributions. While both methods achieved perfect specificity for forests with 100+ trees, PERMANOVA consistently outperformed the chi-square test in sensitivity across all epidemic and forest sizes. Our approach, implemented in the R package *mixtree*, is the first statistical framework to robustly compare epidemic forests.
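To make the PERMANOVA idea concrete, the sketch below compares two toy forests in plain Python. It is an illustration under stated assumptions, not the *mixtree* implementation or its API: a tree is encoded as a list giving the infector of each non-index case, the distance between two trees is taken to be the fraction of cases whose infector differs, and `simulate_forest` is a hypothetical generator (not the paper's simulation design) in which a larger `p_hub` concentrates transmission on a single case.

```python
import random
from itertools import combinations


def simulate_forest(n_trees, n_cases, p_hub, rng):
    """Toy generator (assumption): case i >= 1 is infected by case 0 with
    probability p_hub, otherwise by a uniformly chosen earlier case."""
    return [
        [0 if rng.random() < p_hub else rng.randrange(i) for i in range(1, n_cases)]
        for _ in range(n_trees)
    ]


def tree_distance(t1, t2):
    """Assumed distance: fraction of cases whose infector differs."""
    return sum(a != b for a, b in zip(t1, t2)) / len(t1)


def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F statistic computed from a distance matrix."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    a = len(groups)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(trees, labels, n_perm=199, seed=0):
    """Return (pseudo-F, permutation p-value) for trees grouped by forest label."""
    rng = random.Random(seed)
    n = len(trees)
    dist = [[tree_distance(trees[i], trees[j]) for j in range(n)] for i in range(n)]
    f_obs = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between trees and forest labels
        if pseudo_f(dist, shuffled) >= f_obs:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return f_obs, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = random.Random(42)
    forest_a = simulate_forest(20, 15, p_hub=0.1, rng=rng)  # diffuse transmission
    forest_b = simulate_forest(20, 15, p_hub=0.9, rng=rng)  # one dominant infector
    labels = ["A"] * len(forest_a) + ["B"] * len(forest_b)
    f_stat, p_value = permanova(forest_a + forest_b, labels)
    print(f"pseudo-F = {f_stat:.2f}, p = {p_value:.3f}")
```

A small p-value indicates that trees from the two forests differ more between forests than within them. The choice of tree distance drives what the test is sensitive to, so the Hamming-style distance used here should be read as a placeholder for whatever measure suits the analysis.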