🤖 AI Summary
Classical principal coordinates analysis (PCoA) defines its axes through pairwise dissimilarities, so it does not directly identify the microbial taxa driving a β-diversity ordination, which limits interpretability. This work proposes Bayesian sparse PCoA (BSPCoA), a post hoc method that approximates the leading principal coordinates with sparse linear surrogates in the observed taxa and introduces a delta-tolerance diagnostic to assess how faithful that approximation is. The method places a three-parameter beta-normal global–local prior on the surrogate coefficients to induce row-wise sparsity, accommodates ecological dissimilarities such as Bray–Curtis, and approximately preserves ordination geometry while enabling taxon-level interpretation and quantification of posterior uncertainty; it reduces to sparse PCA under Euclidean distance. Simulations and an empirical analysis of the Hadza gut microbiome indicate that BSPCoA yields an ordination close to that of classical PCoA and identifies a parsimonious set of taxa associated with seasonal variation.
📝 Abstract
Principal coordinates analysis (PCoA) is a standard exploratory tool for microbiome beta-diversity studies, but its axes are defined by pairwise dissimilarities and therefore do not directly identify the taxa driving an ordination. We propose Bayesian sparse principal coordinates analysis (BSPCoA), a post hoc framework that approximates the leading principal coordinates by a sparse linear surrogate in the observed taxa. A delta-tolerance diagnostic quantifies the discrepancy between the classical ordination and its best linear surrogate, clarifying when taxon-level interpretation is well supported. We place three-parameter beta-normal global–local priors on the surrogate coefficients to induce row sparsity, obtain posterior uncertainty, and select influential taxa. The method reduces to sparse principal component analysis under Euclidean distance while remaining applicable to ecologically meaningful dissimilarities such as Bray–Curtis and Hellinger distances. Simulation studies show that BSPCoA provides an approximately linear representation of the dominant ordination geometry while enhancing interpretability in sparse microbiome settings. In the Hadza gut microbiome data, the method produces an ordination close to that of classical PCoA while highlighting a parsimonious set of taxa associated with seasonal variation.
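To make the idea of a linear surrogate concrete, the following is a minimal sketch, not the authors' implementation. It computes classical PCoA scores from a Bray–Curtis dissimilarity matrix and then fits the best linear map from centered taxa abundances to those scores. Two simplifications are assumptions of this sketch: ordinary least squares stands in for the Bayesian sparse prior, and a relative residual norm (`rel_gap`) stands in for the delta-tolerance diagnostic, whose exact definition is not given in the abstract. All function names and the simulated count matrix are hypothetical.

```python
import numpy as np

def bray_curtis(X):
    """Pairwise Bray-Curtis dissimilarity between rows of a nonnegative matrix."""
    num = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    den = (X[:, None, :] + X[None, :, :]).sum(axis=2)
    return num / den

def classical_pcoa(D, k=2):
    """Classical PCoA: eigendecompose the double-centered squared dissimilarities."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # Gower-centered matrix
    evals, evecs = np.linalg.eigh(B)
    order = np.argsort(evals)[::-1][:k]          # indices of the k largest eigenvalues
    lam = evals[order]
    # Non-Euclidean dissimilarities can give negative eigenvalues; clip defensively.
    scores = evecs[:, order] * np.sqrt(np.clip(lam, 0.0, None))
    return scores, lam

def linear_surrogate(X, Y):
    """Least-squares map from centered taxa X to PCoA scores Y (no sparsity)."""
    Xc = X - X.mean(axis=0)
    W, *_ = np.linalg.lstsq(Xc, Y, rcond=None)
    # Relative residual: an assumed stand-in for the paper's diagnostic.
    rel_gap = np.linalg.norm(Y - Xc @ W) / np.linalg.norm(Y)
    return W, rel_gap

# Simulated counts: 40 samples, 10 taxa (hypothetical data, strictly positive).
rng = np.random.default_rng(0)
X = rng.poisson(5, size=(40, 10)) + 1.0

D = bray_curtis(X)
Y, lam = classical_pcoa(D, k=2)
W, rel_gap = linear_surrogate(X, Y)
print("surrogate coefficient matrix shape:", W.shape)
print("relative gap between PCoA scores and linear surrogate:", rel_gap)
```

Under Euclidean distance the double-centered matrix equals the Gram matrix of the centered data, so the PCoA scores lie exactly in the column space of the centered taxa matrix and the gap is zero up to floating-point error, which is consistent with the stated reduction to PCA. With Bray–Curtis and more samples than taxa, the gap is generally positive, and a sparse prior on the rows of `W` would then trade some fidelity for a smaller set of selected taxa.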