Deep Generative Models of Evolution: SNP-level Population Adaptation by Genomic Linkage Incorporation

📅 2025-07-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Conventional population genetic models, such as the Wright–Fisher framework, often assume that selected loci are independent and leave parameter uncertainty unaddressed, which limits their applicability to pooled sequencing (Pool-Seq) data from populations evolving under controlled environmental pressures. Deep generative models can capture multivariate dependencies and reduce noise, but their adoption in population genomics has been held back by high data requirements and challenging interpretability. Method: The authors introduce a deep generative neural network for Evolve-and-Resequence (E&R) experiments that estimates the distribution of SNP allele frequency trajectories over time by embedding each SNP's observations together with information from neighboring loci. Contribution/Results: On simulated E&R data, the model captures the distribution of allele frequency trajectories, and inspecting its internally learned representations allows pairwise linkage disequilibrium (LD) to be estimated, which is typically inaccessible in Pool-Seq data. Its LD estimates are competitive with existing methods in data with a high degree of LD.
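As a point of reference for the independent-locus baseline mentioned above, the following is a minimal sketch of a single-locus Wright–Fisher simulation followed by Pool-Seq read sampling. It is not the paper's simulation setup: the haploid model, the function names, and the parameter values (`n_e`, `s`, `coverage`) are all assumptions chosen for illustration.

```python
import random


def wright_fisher_trajectory(p0, n_e, s, generations, rng):
    """Simulate one allele frequency trajectory at a single, independent locus.

    Assumed (illustrative) model: haploid Wright-Fisher population of constant
    size n_e, where the focal allele has relative fitness 1 + s.
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        # Selection step: deterministic shift in the expected frequency.
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
        # Drift step: binomial sampling of n_e gene copies.
        count = sum(1 for _ in range(n_e) if rng.random() < p_sel)
        p = count / n_e
        freqs.append(p)
    return freqs


def pool_seq_observe(freqs, coverage, rng):
    """Mimic Pool-Seq noise: binomial sampling of reads at a fixed coverage."""
    return [
        sum(1 for _ in range(coverage) if rng.random() < p) / coverage
        for p in freqs
    ]


rng = random.Random(0)
true_traj = wright_fisher_trajectory(p0=0.2, n_e=300, s=0.05, generations=60, rng=rng)
observed = pool_seq_observe(true_traj, coverage=50, rng=rng)
print(true_traj[-1], observed[-1])
```

Each locus is simulated in isolation here, which is exactly the simplification the summary criticizes: nothing in this sketch lets a selected site drag the frequencies of its neighbors along with it.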

📝 Abstract

The investigation of allele frequency trajectories in populations evolving under controlled environmental pressures has become a popular approach to study evolutionary processes at the molecular level. Statistical models based on well-defined evolutionary concepts can be used to validate different hypotheses about empirical observations. Despite their popularity, classic statistical models like the Wright-Fisher model suffer from simplifying assumptions, such as the independence of selected loci along a chromosome, and from uncertainty about the parameters. Deep generative neural networks offer a powerful alternative known for the integration of multivariate dependencies and noise reduction. Due to their high data demands and challenging interpretability, they have so far not been widely considered in the area of population genomics. To address the challenges in the area of Evolve and Resequence (E&R) experiments based on pooled sequencing (Pool-Seq) data, we introduce a deep generative neural network that aims to model a concept of evolution based on empirical observations over time. The proposed model estimates the distribution of allele frequency trajectories by embedding the observations from single nucleotide polymorphisms (SNPs) with information from neighboring loci. Evaluation on simulated E&R experiments demonstrates the model's ability to capture the distribution of allele frequency trajectories and illustrates the representational power of deep generative models on the example of linkage disequilibrium (LD) estimation. Inspecting the internally learned representations enables estimating pairwise LD, which is typically inaccessible in Pool-Seq data. Our model provides competitive LD estimation in Pool-Seq data with a high degree of LD when compared to existing methods.
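To make concrete why pairwise LD is typically inaccessible in Pool-Seq data, the sketch below computes the standard LD statistics D and r² from phased haplotypes. The function name and the toy haplotype sets are hypothetical, and this is the textbook definition rather than the estimator the paper derives from its learned representations.

```python
def pairwise_ld(haplotypes):
    """Return (D, r2) for two biallelic loci.

    `haplotypes` is a list of (a, b) tuples with 0/1 alleles at locus A and
    locus B, i.e. phased data that pooled sequencing does not provide.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    # D measures how far the joint frequency deviates from independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r2 is undefined when either locus is monomorphic.
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Two toy populations with identical allele frequencies (0.5 at both loci).
linked = [(1, 1)] * 5 + [(0, 0)] * 5
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(pairwise_ld(linked))    # complete LD
print(pairwise_ld(unlinked))  # no LD
```

Both toy populations would yield the same per-site allele frequencies in a pool, yet their LD differs completely. Per-site frequencies alone therefore do not determine D, which is why LD has to be inferred indirectly from Pool-Seq data.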

Problem

Research questions and friction points this paper is trying to address.

Modeling SNP-level population adaptation using genomic linkage

Overcoming limitations of classic evolutionary models with deep learning

Estimating linkage disequilibrium from pooled sequencing data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep generative neural network models evolution

Integrates SNP data with neighboring loci

Estimates linkage disequilibrium in Pool-Seq data
