Tractable Maximization of Budgeted Phylogenetic Diversity on Networks Utilizing Node Scanwidth

📅 2026-05-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the NP-hard problem of maximizing phylogenetic diversity (PD) under a budget constraint in phylogenetic networks with non-uniform protection costs. The authors propose a parameterized algorithmic framework based on node scanwidth (nsw), a structural parameter that measures how tree-like a network is, and use it to optimize PD exactly under heterogeneous costs. They also introduce the first exact algorithms to compute nsw and its associated decomposition. The approach combines dynamic programming, data reduction rules, and integer linear programming, and is efficient on networks whose nsw is small. Experiments on highly reticulated simulated networks with several hundred taxa show that both optimal PD scores and nsw values can be computed in fractions of a second, and that the budgeted algorithms substantially outperform existing baseline methods, which are limited to unit costs.

📝 Abstract

Identifying a subset of taxa that maximizes Phylogenetic Diversity (PD) is a cornerstone of quantitative conservation planning. Traditionally, PD is defined over a phylogenetic tree in which leaves represent present-day taxa and the branch lengths capture the estimated evolutionary distinctiveness. While PD maximization is computationally tractable on trees with unit costs, the problem becomes NP-hard when transitioning to phylogenetic networks or to budgeted versions in which protecting taxa incurs non-homogeneous costs. In this paper, we address these two challenges by providing definitions and a comprehensive analysis of three distinct variants of budgeted PD on networks. We conduct our study through the lens of a small structural parameter, node scanwidth (nsw), which measures the "tree-likeness" of a phylogenetic network. We show that two of the considered variants can be optimized in O*(2^nsw B^2) time, where B is the budget. For the computationally harder third variant, we provide an algorithm to compute PD scores in O*(3^nsw) time. We further contribute the first exact algorithms to compute node scanwidth, recognizing that the utility of algorithms based on nsw depends on the ability to compute nsw and its corresponding decomposition. Our approaches integrate data reduction rules, dynamic programming, and an Integer Linear Programming formulation. We validate our theoretical results through extensive experiments on highly reticulated, simulated networks containing several hundred taxa, using heterogeneous costs. Our implementation computes PD scores and optimal nsw in fractions of a second, even on the most challenging instances. Furthermore, our budgeted optimization algorithms significantly outperform existing benchmarks for computing PD on networks, which were previously limited to unit-cost scenarios. The software makes analyses even on networks with a thousand taxa tracta...
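To make the budgeted setting concrete, here is a minimal Python sketch of budgeted PD on a rooted tree (not a network) using a standard knapsack-style dynamic program over integer budgets. It is not the authors' node-scanwidth algorithm, which the abstract does not spell out. The function names `budgeted_pd_tree` and `pd_of_set`, the dictionary-based tree encoding, the toy tree, and the assumptions of positive integer costs and non-negative branch lengths are all illustrative choices rather than anything taken from the paper.

```python
NEG = float("-inf")  # marks "no taxon below this node is affordable"


def budgeted_pd_tree(children, cost, root, budget):
    """Best PD on a rooted tree when saved leaves may cost at most `budget`.

    children: dict node -> list of (child, branch_length); leaves have no entry.
    cost:     dict leaf -> positive integer cost (an assumption of this sketch).
    """
    def solve(v):
        # table[b] = best PD inside the subtree of v when at least one leaf
        # below v is saved with total cost <= b, or NEG if that is impossible.
        kids = children.get(v, [])
        if not kids:
            return [0.0 if b >= cost[v] else NEG for b in range(budget + 1)]
        acc = [NEG] * (budget + 1)
        for child, length in kids:
            # An edge counts only if something below it is saved.
            gain = [x + length for x in solve(child)]
            new = acc[:]
            for b in range(budget + 1):
                new[b] = max(new[b], gain[b])          # use this child alone
                for b1 in range(b + 1):                # split budget b1 / b - b1
                    new[b] = max(new[b], acc[b1] + gain[b - b1])
            acc = new
        return acc

    return max(0.0, solve(root)[budget])  # saving nothing yields PD 0


def pd_of_set(children, root, chosen):
    """PD of a given leaf set: total length of edges above any chosen leaf."""
    def walk(v):
        kids = children.get(v, [])
        if not kids:
            return 0.0, v in chosen
        total, hit = 0.0, False
        for child, length in kids:
            sub_total, sub_hit = walk(child)
            if sub_hit:
                total += sub_total + length
                hit = True
        return total, hit
    return walk(root)[0]


# Hypothetical toy instance: root r, inner node x, taxa a, b, c.
tree = {"r": [("x", 1.0), ("c", 4.0)], "x": [("a", 2.0), ("b", 3.0)]}
cost = {"a": 2, "b": 2, "c": 3}

print(budgeted_pd_tree(tree, cost, "r", 4))  # → 6.0 (save a and b)
```

The inner loop tries every split of the budget between the children processed so far and the next child, so the work per node grows quadratically with the budget. On networks, where a taxon can descend from several parents, this simple tree recursion no longer applies directly, which is the gap the nsw-based algorithms in the paper are designed to address.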

Problem

Research questions and friction points this paper is trying to address.

Phylogenetic Diversity

Budgeted Optimization

Phylogenetic Networks

NP-hard

Conservation Planning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Budgeted Phylogenetic Diversity

Phylogenetic Networks

Node Scanwidth