Detecting Evolutionary Change-Points with Branch-Specific Substitution Models and Shrinkage Priors

📅 2025-07-11
🤖 AI Summary
Existing branch-specific substitution models for detecting shifts in selection pressure or mutation rate along phylogenetic trees typically require prior knowledge of change-point locations or scale poorly to large datasets. Method: We propose an automated change-point detection framework that needs no prior knowledge of change-point locations. It integrates branch-specific substitution models with shrinkage priors and estimates distinct substitution parameters for each branch. Inference in this high-dimensional parameter space is made tractable by an analytical gradient algorithm whose computation time is linear in the number of parameters, combined with Hamiltonian Monte Carlo sampling. Contribution/Results: Applied to the BRCA1 gene in primates and to viral sequences from the recent mpox epidemic, our method infers selection pressure dynamics and mutational dynamics, respectively. It achieves up to a 90-fold speedup per iteration in maximum-likelihood optimization over a central-difference numerical gradient, and up to a 360-fold improvement in Bayesian inference over a conventional univariate random-walk sampler.

📝 Abstract
Branch-specific substitution models are popular for detecting evolutionary change-points, such as shifts in selective pressure. However, applying such models typically requires prior knowledge of change-point locations on the phylogeny or faces scalability issues with large data sets. To address both limitations, we integrate branch-specific substitution models with shrinkage priors to automatically identify change-points without prior knowledge, while simultaneously estimating distinct substitution parameters for each branch. To enable tractable inference under this high-dimensional model, we develop an analytical gradient algorithm for the branch-specific substitution parameters whose computation time is linear in the number of parameters. We apply this gradient algorithm to infer selection pressure dynamics in the evolution of the BRCA1 gene in primates and mutational dynamics in viral sequences from the recent mpox epidemic. Our novel algorithm enhances inference efficiency, achieving up to a 90-fold speedup per iteration in maximum-likelihood optimization compared to a central-difference numerical gradient method, and up to a 360-fold improvement in computational performance within a Bayesian framework using a Hamiltonian Monte Carlo sampler compared to a conventional univariate random-walk sampler.
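The cost gap between a central-difference gradient and an analytical one can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a simple Poisson-style log-likelihood with one rate per branch (the function names, data, and step size `h` are all hypothetical) and only shows why a numerical gradient needs two full likelihood evaluations per parameter, while a closed-form gradient needs a single pass.

```python
import math

def log_likelihood(rates, lengths, counts):
    """Toy Poisson log-likelihood (up to a constant), one rate per branch.
    This is an illustrative stand-in, not a phylogenetic likelihood."""
    return sum(k * math.log(r * t) - r * t
               for r, t, k in zip(rates, lengths, counts))

def analytic_gradient(rates, lengths, counts):
    """Closed-form gradient, d/dr_i = k_i / r_i - t_i, in one pass."""
    return [k / r - t for r, t, k in zip(rates, lengths, counts)]

def central_difference_gradient(rates, lengths, counts, h=1e-6):
    """Numerical gradient: two full likelihood evaluations per parameter."""
    grad = []
    evaluations = 0
    for i in range(len(rates)):
        up = list(rates)
        up[i] += h
        down = list(rates)
        down[i] -= h
        diff = (log_likelihood(up, lengths, counts)
                - log_likelihood(down, lengths, counts))
        grad.append(diff / (2 * h))
        evaluations += 2
    return grad, evaluations

# Hypothetical data: four branches with rates, lengths, and event counts.
rates = [0.5, 1.0, 2.0, 0.8]
lengths = [0.1, 0.3, 0.2, 0.4]
counts = [1, 0, 3, 2]

exact = analytic_gradient(rates, lengths, counts)
approx, n_evals = central_difference_gradient(rates, lengths, counts)
```

With N branch-specific parameters, the numerical version makes 2N likelihood calls, each of which touches every branch, so its cost grows quadratically in this toy setting, whereas the analytical version stays linear in N.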
Problem

Research questions and friction points this paper is trying to address.

Detect evolutionary change-points without prior location knowledge
Overcome scalability issues in large phylogenetic datasets
Improve computational efficiency in substitution model inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrate branch-specific models with shrinkage priors
Develop analytical gradient algorithm for efficiency
Achieve up to 90-fold (maximum-likelihood) and up to 360-fold (Bayesian) speedups in inference
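How a shrinkage prior can flag change-points without knowing their locations in advance can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's model: it assumes each branch carries a noisy estimate of the increment between its parameter and its parent's, uses a Laplace prior as a stand-in shrinkage prior (the abstract does not name the specific prior), and takes the MAP estimate, which for Gaussian noise is the soft-thresholding rule. All names and numbers are hypothetical.

```python
def soft_threshold(y, threshold):
    """MAP estimate under Gaussian noise and a Laplace prior:
    shrink y toward zero by `threshold`, clipping small values to 0."""
    if y > threshold:
        return y - threshold
    if y < -threshold:
        return y + threshold
    return 0.0

def detect_change_points(increments, noise_var=0.04, laplace_scale=0.2):
    """Shrink per-branch increments and flag the branches that survive.
    The threshold noise_var / laplace_scale follows from minimising
    (y - x)^2 / (2 * noise_var) + |x| / laplace_scale over x."""
    threshold = noise_var / laplace_scale
    shrunk = [soft_threshold(y, threshold) for y in increments]
    flagged = [i for i, value in enumerate(shrunk) if value != 0.0]
    return shrunk, flagged

# Hypothetical noisy increments (child minus parent) for six branches.
noisy_increments = [0.05, -0.10, 1.30, 0.02, -0.90, 0.08]
shrunk, flagged = detect_change_points(noisy_increments)
```

Small increments collapse to exactly zero, so only branches with a substantial shift (here the branches at indices 2 and 4) remain as candidate change-points. A fully Bayesian treatment would sample the posterior rather than take a point estimate.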
Xiang Ji
Department of Mathematics, School of Science and Engineering, Tulane University, New Orleans, LA, USA
Benjamin Redelings
Department of Mathematics, School of Science and Engineering, Tulane University, New Orleans, LA, USA
Shuo Su
Shanghai Institute of Infectious Disease and Biosecurity, School of Public Health, Fudan University, Shanghai, China
Hongcun Bao
Department of Biochemistry and Molecular Biology, Tulane University, New Orleans, LA, USA
Wu-Min Deng
Department of Biochemistry and Molecular Biology, Tulane University, New Orleans, LA, USA
Samuel L. Hong
Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium
Guy Baele
Associate Professor, KU Leuven
Phylogenetics, Molecular Evolution, Computational Biology, Virology, Bayesian Inference
Philippe Lemey
Full Professor, KU Leuven
Computational biology, virology, evolution, phylogenetics
Marc A. Suchard
Professor, University of California, Los Angeles
applied probability, biomedical informatics, computational statistics, mathematical biology