🤖 AI Summary
Forensic DNA mixture interpretation is complex and often poorly resolved, particularly in massively parallel sequencing (MPS) data, where distinguishing true alleles from sequencing artifacts is critical. This paper introduces the first Bayesian deconvolution framework specifically designed for MPS-based DNA mixtures. It supports modeling of known contributors (e.g., victims) and incorporates a novel allele similarity metric, based on string edit distance, to quantify the impact of sequencing errors. Hypothesis testing is performed via Bayes factors, which improves discriminatory power and statistical reliability in person-of-interest (POI) identification. Experimental evaluation demonstrates that, compared to conventional capillary electrophoresis (CE) methods and existing MPS analysis tools, the framework achieves superior accuracy and robustness, especially for multi-source, low-template, and high-noise mixtures. The approach establishes a generalizable, quantitatively rigorous paradigm for forensic DNA evidence interpretation.
📝 Abstract
Mixture interpretation is a central challenge in forensic science, where evidence often contains contributions from multiple sources. In the context of DNA analysis, biological samples recovered from crime scenes may include genetic material from several individuals, necessitating robust statistical tools to assess whether a specific person of interest (POI) is among the contributors. Methods based on capillary electrophoresis (CE) are currently in use worldwide, but offer limited resolution in complex mixtures. Advancements in massively parallel sequencing (MPS) technologies provide a richer, more detailed representation of DNA mixtures, but require new analytical strategies to fully leverage this information. In this work, we present a Bayesian framework for evaluating whether a POI's DNA is present in an MPS-based forensic sample. The model accommodates known contributors, such as the victim, and uses a novel string edit distance to quantify similarity between observed alleles and sequencing artifacts. The resulting Bayes factors enable effective discrimination between samples that do and do not contain the POI's DNA, demonstrating strong performance in both hypothesis testing and classification settings.
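To make the two core ingredients concrete, the sketch below shows a standard unit-cost Levenshtein distance between allele sequences and a log10 Bayes factor computed from two log marginal likelihoods. Both are illustrative stand-ins: the paper's edit distance is described as novel and is not specified here, so plain Levenshtein is an assumption, and the function names `levenshtein` and `log10_bayes_factor` and the example sequences are hypothetical.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (a stand-in, not the paper's metric)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def log10_bayes_factor(log_ml_h1: float, log_ml_h2: float) -> float:
    """log10 of p(data | H1) / p(data | H2), given natural-log marginal likelihoods."""
    return (log_ml_h1 - log_ml_h2) / math.log(10)


# A single-base substitution separates these two hypothetical sequences.
print(levenshtein("TCTATCTA", "TCTGTCTA"))  # 1
# One missing four-base repeat unit costs four insertions.
print(levenshtein("TCTA", "TCTATCTA"))      # 4
```

Under this reading, an observed sequence at a small edit distance from a true allele is a plausible sequencing artifact of it, and a positive log10 Bayes factor favors the hypothesis that the POI is a contributor (H1) over the alternative (H2).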