🤖 AI Summary
Traditional F<sub>ST</sub>, relying on variance, struggles to distinguish rare from common allelic variants. To address this, we propose F<sub>q</sub>, a generalized genetic differentiation statistic grounded in Tsallis entropy. By tuning the nonextensivity parameter *q*, F<sub>q</sub> differentially weights the allele frequency spectrum: low *q* enhances sensitivity to rare variants, while high *q* emphasizes common ones, giving a finer-grained view of population structure. Framed within information theory, F<sub>q</sub> generalizes classical F<sub>ST</sub>, which it recovers at *q* = 2, and sharpens inference about historical demographic events, including isolation-migration timing and founder effects. Validated on 865 Oceanian genomes and on simulated genealogies using One-vs-Rest and Leave-One-Out strategies, F<sub>q</sub> attributes regional differentiation to the subpopulations that drive it and sensitively dates evolutionary events.
📝 Abstract
We introduce an information-theoretic generalization of the fixation statistic, the Tsallis-order $q$ F-statistic, $F_q$, which measures the fraction of Tsallis $q$-entropy lost within subpopulations relative to the pooled population. The family nests the classical variance-based fixation index $F_{\textbf{ST}}$ at $q{=}2$ and a Shannon-entropy analogue at $q{=}1$, whose absolute form equals the mutual information between alleles and population labels. By varying $q$, $F_q$ acts as a spectral differentiator that up-weights rare variants at low $q$, while $q{>}1$ increasingly emphasizes common variants, providing a more fine-grained view of differentiation than $F_{\textbf{ST}}$ when allele-frequency spectra are skewed. On real data (865 Oceanian genomes with 1,823,000 sites) and controlled genealogical simulations (seeded from 1,432 founders from HGDP and 1000 Genomes panels, with 322,216 sites), we show that $F_q$ in One-vs-Rest (OVR) and Leave-One-Out (LOO) modes provides clear attribution of which subpopulations drive regional structure, and sensitively timestamps isolation-migration events and founder effects. $F_q$ serves as a finer-resolution complement for simulation audits and population-structure summaries.
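The definition above can be illustrated with a minimal single-locus sketch. It assumes that the within-subpopulation entropy is a simple weighted mean of per-subpopulation Tsallis entropies and that the pooled frequencies are the same weighted mean of subpopulation frequencies; the abstract does not spell out either choice, so both are assumptions, and the function names `tsallis_entropy` and `f_q` are hypothetical rather than taken from the authors' code.

```python
import math

def tsallis_entropy(freqs, q):
    """Tsallis q-entropy of an allele-frequency vector (Shannon, in nats, at q = 1)."""
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in freqs if p > 0)
    return (1.0 - sum(p ** q for p in freqs if p > 0)) / (q - 1.0)

def f_q(subpop_freqs, weights, q):
    """Fraction of pooled Tsallis q-entropy lost within subpopulations.

    subpop_freqs: one allele-frequency vector per subpopulation (same allele order).
    weights: subpopulation weights summing to 1 (assumed; e.g. relative sample sizes).
    """
    n_alleles = len(subpop_freqs[0])
    # Pooled frequencies: weighted mean across subpopulations (assumption).
    pooled = [sum(w * f[a] for w, f in zip(weights, subpop_freqs))
              for a in range(n_alleles)]
    h_total = tsallis_entropy(pooled, q)
    # Within-subpopulation entropy: weighted mean of per-subpopulation entropies (assumption).
    h_within = sum(w * tsallis_entropy(f, q) for w, f in zip(weights, subpop_freqs))
    return (h_total - h_within) / h_total if h_total > 0 else 0.0

# Two equally weighted subpopulations at one biallelic site.
example_freqs = [[0.9, 0.1], [0.5, 0.5]]
example_weights = [0.5, 0.5]
for order in (0.5, 1.0, 2.0):
    print(order, f_q(example_freqs, example_weights, order))
```

At $q{=}2$ the Tsallis entropy reduces to $1 - \sum_a p_a^2$, the expected heterozygosity, so under these assumptions the sketch returns the familiar $(H_T - H_S)/H_T$ form, about 0.19 for the example above. At $q{=}1$ the numerator $H_T - H_S$ is the mutual information between alleles and population labels when the weights are read as label probabilities.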