🤖 AI Summary
To address the scalability bottleneck in high-dimensional phylogenetic tree inference, which is caused by super-exponential growth of the posterior space over tree topologies, the paper introduces hyperbolic-space versions of combinatorial and nested combinatorial sequential Monte Carlo (CSMC and NCSMC), along with variational inference extensions (H-VCSMC and H-VNCSMC). Leveraging the natural fit between hyperbolic geometry and hierarchical structures, the method models tree topologies in a geometry suited to them. It is presented as the first unbiased, scalable Bayesian inference algorithm of this kind formulated directly in hyperbolic space. Experiments on both synthetic and real-world datasets report substantial improvements: inference up to 2.3× faster, a 37% reduction in KL divergence for posterior estimation, and scalability to trees with over 10⁴ leaves, outperforming Euclidean-space baselines across all metrics.
📝 Abstract
Hyperbolic space naturally encodes hierarchical structures such as phylogenies (binary trees), where inward-bending geodesics reflect paths through least common ancestors, and the exponential growth of neighborhoods mirrors the super-exponential scaling of topologies. This scaling challenge limits the efficiency of Euclidean-based approximate inference methods. Motivated by the geometric connections between trees and hyperbolic space, we develop novel hyperbolic extensions of two sequential search algorithms: Combinatorial and Nested Combinatorial Sequential Monte Carlo (CSMC and NCSMC). Our approach introduces consistent and unbiased estimators, along with variational inference methods (H-VCSMC and H-VNCSMC), which outperform their Euclidean counterparts. Empirical results demonstrate improved speed, scalability and performance in high-dimensional phylogenetic inference tasks.
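The geometric intuition in the abstract, that geodesics bend inward so the path between two leaves resembles a path through their common ancestor, can be illustrated with a small sketch. This is not the paper's algorithm. It assumes the Poincaré disk model (the abstract does not say which model of hyperbolic space is used), and the names `poincare_distance` and `leaf` are hypothetical helpers introduced only for illustration.

```python
import math

def poincare_distance(x, y):
    """Geodesic distance between two points strictly inside the unit Poincare disk."""
    sq_norm_x = sum(a * a for a in x)
    sq_norm_y = sum(b * b for b in y)
    if sq_norm_x >= 1.0 or sq_norm_y >= 1.0:
        raise ValueError("points must lie strictly inside the unit disk")
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    # Standard closed form: arcosh(1 + 2|x-y|^2 / ((1-|x|^2)(1-|y|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_x) * (1.0 - sq_norm_y)))

def leaf(radius, angle):
    """Hypothetical helper: place a 'leaf' at a given radius and angle in the disk."""
    return (radius * math.cos(angle), radius * math.sin(angle))

# Assumption for illustration: the root sits at the origin and two leaves
# sit close to the boundary, a quarter turn apart.
root = (0.0, 0.0)
leaf_a = leaf(0.99, 0.0)
leaf_b = leaf(0.99, math.pi / 2)

direct = poincare_distance(leaf_a, leaf_b)
via_root = poincare_distance(leaf_a, root) + poincare_distance(root, leaf_b)

# In the hyperbolic disk the direct route is only slightly shorter than the
# detour through the root; in the Euclidean plane the saving is much larger.
euclid_direct = math.dist(leaf_a, leaf_b)
euclid_via_root = math.dist(leaf_a, root) + math.dist(root, leaf_b)

print("hyperbolic ratio direct/via_root:", direct / via_root)
print("euclidean  ratio direct/via_root:", euclid_direct / euclid_via_root)
```

For points near the boundary, the hyperbolic ratio is close to 1, which is the tree-like behavior the abstract refers to: the shortest path between two leaves is almost the same as walking up to the ancestor and back down.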