🤖 AI Summary
The classical Tree Containment problem in phylogenetic network analysis is not robust to poorly supported branches in biological data and can therefore produce false negatives. *Soft Tree Containment*, a relaxed variant introduced by Bentert, Malík, and Weller, accounts for this uncertainty. This work presents an algorithm for Soft Tree Containment whose running time depends on the maximum out-degree Δ_T of the tree and on k = sw(Γ) + Δ_N, where sw(Γ) is the *scanwidth* of a given tree extension Γ of the network and Δ_N is the maximum out-degree of the network. The algorithm runs in 2^{O(Δ_T·k·log k)}·n^{O(1)} time, where n is the input size, so it is fixed-parameter tractable in Δ_T and k. Because phylogenetic networks encountered in practice often have low scanwidth, this parameterization makes the problem more tractable on such inputs.
📝 Abstract
Phylogenetic networks allow modeling reticulate evolution, capturing events such as hybridization and horizontal gene transfer. A fundamental computational problem in this context is the Tree Containment problem, which asks whether a given phylogenetic network is compatible with a given phylogenetic tree. However, the classical statement of the problem is not robust to poorly supported branches in biological data, possibly leading to false negatives. In an effort to address this, a relaxed version that accounts for uncertainty, called Soft Tree Containment, has been introduced by Bentert, Malík, and Weller [SWAT'18]. We present an algorithm that solves Soft Tree Containment in $2^{O(\Delta_T \cdot k \cdot \log(k))} \cdot n^{O(1)}$ time, where $k = \operatorname{sw}(\Gamma) + \Delta_N$, with $\Delta_T$ and $\Delta_N$ denoting the maximum out-degrees in the tree and the network, respectively, and $\operatorname{sw}(\Gamma)$ denoting the "scanwidth" [Berry, Scornavacca, and Weller, SOFSEM'20] of a given tree extension of the network, while $n$ is the input size. Our approach leverages the fact that phylogenetic networks encountered in practice often exhibit low scanwidth, making the problem more tractable.
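
To make the parameter $k = \operatorname{sw}(\Gamma) + \Delta_N$ concrete, the following is a minimal Python sketch that computes it for a small network. It is not the paper's algorithm. The abstract does not restate the definition of scanwidth, so the sketch assumes the usual one from Berry, Scornavacca, and Weller: a tree extension $\Gamma$ of a network $N$ is a rooted tree on the nodes of $N$ in which every arc of $N$ goes from an ancestor to a descendant, and $\operatorname{sw}(\Gamma)$ is the maximum, over nodes $v$, of the number of arcs $(x, y)$ of $N$ with $x$ a strict ancestor of $v$ in $\Gamma$ and $y$ in the subtree of $\Gamma$ rooted at $v$. The function names, the input encoding, and the example network are all hypothetical.

```python
from collections import defaultdict


def scanwidth_of_extension(arcs, parent):
    """Scanwidth of a given tree extension (assumed definition, see lead-in).

    arcs   -- list of (u, v) arcs of the network N
    parent -- dict mapping each node to its parent in the tree extension
              Gamma; the root maps to None
    """
    crossing = defaultdict(int)
    for x, y in arcs:
        # Every node v on the Gamma-path from y (inclusive) up to x
        # (exclusive) has x strictly above it and y in its subtree,
        # so the arc (x, y) is counted once for each such v.
        v = y
        while v is not None and v != x:
            crossing[v] += 1
            v = parent[v]
        if v is None:
            # We reached the root without meeting x: x is not an
            # ancestor of y, so Gamma is not a tree extension of N.
            raise ValueError(f"arc {(x, y)} is not respected by the extension")
    return max(crossing.values(), default=0)


def max_out_degree(arcs):
    """Maximum out-degree (Delta_N) of the network given by its arcs."""
    out_degree = defaultdict(int)
    for u, _ in arcs:
        out_degree[u] += 1
    return max(out_degree.values(), default=0)


def parameter_k(arcs, parent):
    """k = sw(Gamma) + Delta_N, as in the stated running time."""
    return scanwidth_of_extension(arcs, parent) + max_out_degree(arcs)


# Hypothetical network with one reticulation h (parents a and b).
arcs = [("r", "a"), ("r", "b"), ("a", "h"), ("b", "h"),
        ("a", "x"), ("b", "y"), ("h", "z")]
# One possible tree extension: r - a - b - h - z, with leaf x below a
# and leaf y below b.
parent = {"r": None, "a": "r", "b": "a", "h": "b",
          "z": "h", "x": "a", "y": "b"}

sw = scanwidth_of_extension(arcs, parent)
k = parameter_k(arcs, parent)
```

For this example the scanwidth of the chosen extension is 2 and the maximum out-degree of the network is 2, giving $k = 4$. A different tree extension of the same network may yield a different scanwidth, which is why the bound is stated for a given extension $\Gamma$.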