The Age of Selection-Duality Mutation under Fluctuating Selection among Individuals (FSI)


Abstract

Our recent work on molecular evolution and population genetics postulated that individuals carrying a specific mutation exhibit fluctuating fitness, termed FSI (fluctuating selection among individuals), whereas the fitness of the wild type remains constant. An intriguing phenomenon called selection-duality emerges: a slightly beneficial mutation can be subject to negative selection (the substitution rate is less than the mutation rate). Selection-duality is bounded by two neutralities: the generic neutrality, where the mutation is neutral in terms of mean fitness, and the substitution neutrality, where the substitution rate equals the mutation rate. The midpoint between the generic neutrality and the substitution neutrality is called the FSI-neutrality. An important problem concerns the age profile of allele frequency, i.e., the time of origin of a mutation whose frequency in the current population is given (the allele-age problem for short). Solving this problem under selection-duality would help extend the standard coalescent theory, which is based on strict neutrality, to a more general form. In this paper, we studied the allele-age problem under selection-duality by the first-arrival time approach and the mean age approach, respectively. Since the general solution of the allele-age problem under selection-duality is not available, we focused on solving the problem at the substitution neutrality (the upper bound of selection-duality), the FSI-neutrality (the midpoint) and the generic neutrality (the lower bound), respectively. Our analysis yields an overall picture in which the mean first-arrival age of a mutation at the substitution neutrality is theoretically identical to that at the FSI-neutrality, which is numerically close to that at the generic neutrality.
For illustration, we calculated the mean age of nonsynonymous mutations in the human population and demonstrated that the allele age could be considerably overestimated when the effect of FSI is neglected.

bioRxiv preprint doi: https://doi.org/10.64898/2026.01.30.701161; this version posted February 2, 2026. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.

Introduction

In the population genetics theory of molecular evolution, a fixed view of the selective nature of a mutation is a fundamental assumption: any single mutation is postulated to have the same fitness effect in all individuals with the same genotype (Crow and Kimura 1970; Kimura 1983). For instance, a neutral mutation is selectively neutral for all individuals who carry it, and likewise for a deleterious or beneficial mutation. By contrast, FSI, short for fluctuating selection among individuals, refers to the phenomenon in which individuals with a specific mutation exhibit a broader phenotypic variation, resulting in a fitness fluctuation, whereas the fitness of the wild type remains constant. The biological basis of FSI is well illustrated by studies in human genetics, which have demonstrated that mutations frequently exhibit different effects in different individuals (Riordan and Nadeau 2017; Eldar et al. 2009; Raj et al. 2010; Jensen et al. 2025). By underlying mechanism, FSI can be roughly classified into genetic background (Chandler et al. 2013; Mullis et al. 2018), stochastic gene expression (Raj and van Oudenaarden 2007; Elowitz et al. 2002; Ozbudak et al. 2002; Maamar et al. 2007; Vu et al. 2015), incomplete penetrance (Khoury 1988; Eldar et al. 2009; Suel et al. 2007), and the complexity of the genotype-phenotype map (Dowell et al. 2010; Lehner 2013; Taylor and Ehrenreich 2014). These categories are not mutually exclusive (Raj et al. 2010). The pattern of molecular evolution and population genetics of FSI has been studied recently (Gu 2025a; 2025b). Intriguingly, a novel phenomenon called 'selection duality' emerges from FSI: mutations that are statistically slightly beneficial are subject to negative selection. Selection duality merges into the conventional strict neutrality when FSI vanishes.
Gu (2025b) showed that the substitution rate tends to be inversely related to the log of the effective population size ($N_e$) when FSI is nontrivial, and developed a statistical procedure to predict the strength of FSI relative to the $N_e$-genetic drift. Meanwhile, Gu (2025a) studied the population genetics of FSI, and in particular evaluated the effects of FSI on sequence divergence between species and genetic diversity within a population, revealing a provocative interpretation of the McDonald-Kreitman test that differs from both the neutralist view and the selectionist view (Hahn 2008; Kern and Hahn 2018; Jensen et al. 2019; Munoz-Gomez et al. 2021; Gu 2021; Galtier 2024; de Jong et al. 2024). An important problem concerns the age profile of allele frequency, i.e., the time of origin of a mutation whose frequency in the current population is given. We refer to it as the allele-age problem for short. Solving this problem under selection duality would help extend the standard coalescent theory, which is based on strict neutrality, to a more general form. In this paper, we studied the allele-age problem. Since the general solution under selection duality is not available, we focused on three special cases: the substitution neutrality (the upper bound of selection duality), the FSI-neutrality (the midpoint) and the generic neutrality (the lower bound). We implemented two approaches: the first-arrival time approach, which considers the expected number of generations required for a mutation to first arrive at the specified allele frequency, and the mean age approach, which considers the average number of generations since an allele with the current frequency had a lower frequency.
See Crow and Kimura (1970) or Ewens (2004) for the mathematical details. Our aim is to examine whether the analysis based on the three types of neutrality can provide an overall picture of the allele age under selection-duality. This analysis is also relevant to coalescent theory. Although a deep coalescent analysis under FSI is beyond the current scope, we speculate that the expected coalescent time under a constant population size would, as the sample size approaches infinity, converge to the mean age of a selection-duality mutation, whose analytical form is given by the current study.

Results

The Wright-Fisher diffusion model under FSI

Consider a random mating population of a monoecious diploid organism. In a finite population, we assume that each individual produces a large number of offspring and that exactly $N$ of those survive to maturity. Let a and A be the mutant and wild-type alleles at a particular locus, respectively, whose fitness effects are additive. The FSI model postulates that the fitness effect of mutant a fluctuates among individuals, whereas that of wild-type A remains constant. Therefore, the relative fitness of genotype AA, Aa, or aa is, on average, given by 1, $1+\bar{s}$ and $1+2\bar{s}$, respectively; $\bar{s}$ is called the mean of the selection coefficient ($s$) of mutant a. Let $V_{\bar{s}}$ be the variance of $\bar{s}$ and $\mathrm{Var}(s)$ be the variance of $s$, respectively. Noting that the number of mutant a alleles is $2Nx$, where $x$ is the frequency of mutant a in a generation, we obtain (1) where $\varepsilon_{mut}^2$ is called the FSI-coefficient; a large value means a strong FSI and vice versa, and the subscript indicates the mutation-induced FSI. Gu (2025a, 2025b) developed a diffusion model to study the Wright-Fisher model under FSI, under which the infinitesimal mean $\mu(x)$ and the infinitesimal variance $\sigma^2(x)$ are, respectively, given by (2) where $N_e$ is the effective population size, which inversely measures the strength of genetic drift with respect to Wright's sampling process in a finite population. Briefly, $\mu(x)$ describes the deterministic factors that may influence the gene frequency change, and $\sigma^2(x)$ describes the random effect of genetic drift.
In the subsequent analysis it is convenient to use the (adjusted) selection-FSI ratio ($\rho$) defined by (3) Here $\rho>0$ indicates positive selection, whereas $\rho<0$ indicates negative selection. Moreover, FSI emerges as a new source of genetic drift, measured by the FSI-strength $2N_e\varepsilon_{mut}^2$. Reading the FSI-strength as the ratio $\varepsilon_{mut}^2 / [1/(2N_e)]$, one may claim a dominant FSI-genetic drift when $2N_e\varepsilon_{mut}^2>1$, or a dominant $N_e$-genetic drift when $2N_e\varepsilon_{mut}^2<1$. It is mathematically concise to use a relative measure of FSI-strength ($F$), as given by (4) such that $F$ increases from 0 at $2N_e\varepsilon_{mut}^2=0$ (no FSI) and approaches 1 when $2N_e\varepsilon_{mut}^2 \gg 1$.

Substitution rate and emergence of selection duality

The substitution rate ($\lambda$) plays a central role in the theory of molecular evolution. Let $v$ be the mutation rate and $N$ be the census population size. At the population level, the substitution rate can be defined as the number of new mutations per generation ($2Nv$) multiplied by the fixation probability of a single mutation with the initial frequency of $1/(2N)$, based on the assumption of a rare, single de novo mutation event. Let $u(p)$ be the fixation probability of a mutation in a finite population, with the initial frequency $p$. Formally, the substitution rate can be written as $\lambda = 2Nv \times u(1/(2N))$. Gu (2025b) showed that (5) (see Methods for details). It should be noticed that $\lambda>v$ (the substitution rate is greater than the mutation rate) when $\rho>0$, indicating positive selection, whereas $\lambda<v$ (the substitution rate is less than the mutation rate) when $\rho<0$, indicating negative selection.
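As a point of reference for the definition $\lambda = 2Nv \times u(1/(2N))$, the following minimal Python sketch evaluates the ratio $\lambda/v$ in the no-FSI special case only. It uses the classical fixation probability of Kimura (1962) for an additive mutant, $u(p) = (1-e^{-4N_e s p})/(1-e^{-4N_e s})$, and does not implement Eq.(5). The function names and parameter values are illustrative assumptions.

```python
import math


def fixation_probability(p, ne, s):
    """Kimura (1962) fixation probability of an additive mutant without FSI.

    p  : initial allele frequency
    ne : effective population size
    s  : selection coefficient of the heterozygote (homozygote: 2s)
    """
    big_s = 4.0 * ne * s
    if abs(big_s) < 1e-12:
        return p  # strictly neutral limit: u(p) = p
    # expm1 keeps precision when the exponents are small
    return math.expm1(-big_s * p) / math.expm1(-big_s)


def relative_substitution_rate(n, ne, s):
    """lambda / v = 2N * u(1/(2N)) for a single new mutation (no FSI)."""
    return 2.0 * n * fixation_probability(1.0 / (2.0 * n), ne, s)


if __name__ == "__main__":
    # Illustrative values only (assumption): N = Ne = 10,000
    for s in (-1e-4, 0.0, 1e-4):
        print(s, relative_substitution_rate(10_000, 10_000, s))
```

Under these assumed values the ratio falls below 1 for $s<0$, equals 1 for $s=0$ and exceeds 1 for $s>0$, which is the no-FSI counterpart of the sign rule stated for $\rho$.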
In the case of no FSI, Eq.(5) reduces to the well-known formula first reported by Kimura (1962). Further analysis of Eq.(5) indicates that, when $\bar{s} > 2\varepsilon_{mut}^2$ ($\rho > 0$) or $\bar{s}<0$ ($\rho < -4$), FSI plays only a marginal role in molecular evolution driven by positive selection or negative selection, respectively. However, between them, i.e., $-4 < \rho < 0$, an intriguing phenomenon called selection-duality emerges: a slightly beneficial mutation defined by $0 < \bar{s} < 2\varepsilon_{mut}^2$, or $-4 < \rho < 0$ (6) is subject to negative selection ($\lambda<v$ because of $\rho<0$). Selection-duality defined by Eq.(6) is bounded by two types of neutralities. The lower bound is the generic neutrality ($\bar{s}=0$), at which the mutation is neutral in terms of mean fitness. The upper bound of selection-duality is the substitution neutrality ($\bar{s}=2\varepsilon_{mut}^2$), at which the substitution rate equals the mutation rate ($\lambda=v$). The breadth of selection duality depends on the magnitude of $\varepsilon_{mut}^2$. Without FSI, i.e., $\varepsilon_{mut}^2=0$, selection duality vanishes as the generic neutrality and the substitution neutrality merge into the classical neutrality. In addition to those boundary neutralities, the midpoint of selection-duality at $\bar{s}=\varepsilon_{mut}^2$ or $\rho=-2$, called FSI-neutrality, may play a pivotal role in the new theory of molecular evolution, as shown later.

The mean age of a mutation with a given frequency: approximated by the first-arrival theory

Let $\tilde{T}_x(p)$ be the average number of generations until the mutant reaches frequency $x$ for the first time, starting from a lower frequency $p$.
As shown in Methods, $\tilde{T}_x(p)$ can be derived by the Wright-Fisher diffusion model (Kimura and Ohta 1973). It should be noticed that, as $x \to 1$, $\tilde{T}_x(p)$ equals the mean fixation time of a mutation, $T_{fix}(p)$, given the initial frequency $p$ (Kimura and Ohta 1969). We are particularly interested in the average number of generations until the mutant reaches frequency $x$ for the first time since its single origin with the allele frequency $p=1/(2N)$, where $N$ is the census population size. Therefore, the age distribution of a mutant since its origin can be approximated by letting $p \to 0$, that is, (7) There are two functions in Eq.(7). The first one is $u_x(\xi)$, the probability that a mutant first reaches frequency $x$ before being lost, i.e., before reaching the zero boundary, given the initial frequency $\xi$, where the condition $0 < \xi \le x$ holds. It can be calculated by (8) where $G(\xi)$ in Eq.(8) is defined by (9) When $x=1$, $u_x(\xi)$ becomes the well-known fixation probability. The second function is $\phi_x(\xi)$, the sojourn time in the interval $(0, x]$, given the initial frequency $\xi$ that satisfies $0 < \xi \le x$, as given by (10) The solution of $\tilde{T}_x$ by Eq.(7) is mathematically complex; there is no analytical solution in general. Nevertheless, we are interested in the (first-arrival) age of mutations in some special cases of selection duality, as shown below.

$\tilde{T}_x$ at substitution neutrality is identical to that at FSI-neutrality but differs marginally from that at generic neutrality

As discussed above, the selection-duality defined by Eq.(6) is bounded by two types of neutralities.
The lower bound is the generic neutrality ($\bar{s}=0$, or $\rho = -4$), at which the mutation is neutral in terms of mean fitness. Meanwhile, the upper bound of selection-duality is the substitution neutrality ($\bar{s}=2\varepsilon_{mut}^2$, or $\rho = 0$), at which the substitution rate of a mutation equals the mutation rate ($\lambda=v$). In addition to those boundary neutralities, the midpoint of selection-duality is given by $\bar{s}=\varepsilon_{mut}^2$ or $\rho=-2$, called FSI-neutrality. Note that the analytical solution of $\tilde{T}_x$ is not available over the whole range of selection-duality. Instead, we analyze $\tilde{T}_x$ at each of the three neutralities so that, taken together, they provide an overall pattern of $\tilde{T}_x$ under selection duality.

Intriguingly, we found that $\tilde{T}_x$ at the substitution neutrality is identical to that at the FSI-neutrality. While the detailed derivation can be found in Methods, a brief argument is presented here. We first calculate the product $\phi_x(\xi)u_x(\xi)[1 - u_x(\xi)]$. At either the substitution neutrality ($\rho = 0$) or the FSI-neutrality ($\rho = -2$), this product is given by (11) According to Eq.(7), one can immediately conclude that the (first-arrival) age of a mutation, $\tilde{T}_x$, at the substitution neutrality and that at the FSI-neutrality are the same, which is given by (12) We have $\tilde{T}_x = 0$ when $x = 0$, which means that a very low frequency indicates a very recent origin. On the other hand, we have $\tilde{T}_x = T_{fix}$ when $x = 1$, i.e., the mean fixation time, as given by (13) which further reduces to $4N_e$ when $F = 0$, i.e., for strictly neutral mutations without FSI.
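To make the structure of the first-arrival integral concrete, the sketch below treats only the strictly neutral case without FSI ($F=0$, $p \to 0$). It assumes the standard diffusion forms of Kimura and Ohta (1973), namely $u_x(\xi)=\xi/x$, $\phi_x(\xi)=4N_e x/[\xi(1-\xi)]$ and $\tilde{T}_x=\int_0^x \phi_x(\xi)u_x(\xi)[1-u_x(\xi)]\,d\xi$, which are taken here to be the neutral limit of Eqs.(7)-(10). It is not an implementation of Eq.(12). The numerical integral is compared with the closed form $4N_e[1+\frac{1-x}{x}\ln(1-x)]$, which vanishes as $x \to 0$ and equals $4N_e$ at $x=1$. Function names are hypothetical.

```python
import math


def first_arrival_age_neutral(x, ne):
    """Closed-form mean first-arrival age, strictly neutral case (F = 0)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 4.0 * ne  # mean fixation time of a neutral mutation
    return 4.0 * ne * (1.0 + (1.0 - x) / x * math.log1p(-x))


def first_arrival_age_numeric(x, ne, steps=20000):
    """Midpoint-rule integral of phi_x * u_x * (1 - u_x) over (0, x)."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        xi = (i + 0.5) * h
        u = xi / x                               # neutral arrival probability
        phi = 4.0 * ne * x / (xi * (1.0 - xi))   # neutral sojourn-time factor
        total += phi * u * (1.0 - u) * h
    return total


if __name__ == "__main__":
    ne = 10_000  # illustrative effective population size (assumption)
    for x in (0.01, 0.1, 0.5, 0.9):
        print(x, first_arrival_age_neutral(x, ne), first_arrival_age_numeric(x, ne))
```

The two columns should agree closely, and the age increases with $x$, consistent with the limits $\tilde{T}_x = 0$ at $x=0$ and $\tilde{T}_x = 4N_e$ at $x=1$ for $F=0$.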
Fig.1 plots the (first-arrival) age distribution ($\tilde{T}_x$) against the allele frequency ($x$) at FSI-neutrality. While $\tilde{T}_x$ increases monotonically with $x$, the overall effect of FSI is to reduce the age of mutations. One may wonder whether $\tilde{T}_x$ at the generic neutrality is also the same. To this end, we examined the product $\phi_x(\xi)u_x(\xi)[1 - u_x(\xi)]$ by Eq.(M6) with $\rho=-4$. As shown by Eq.(M10), one may conclude that the first-arrival age of a mutation at the generic neutrality differs from that at the substitution neutrality or the FSI-neutrality. Moreover, $\tilde{T}_x$ at the generic neutrality given by Eq.(M11) is analytical but tedious. Nevertheless, a direct numerical integration showed that the difference in $\tilde{T}_x$ between the generic neutrality and the substitution/FSI-neutrality is marginal (not shown).

The mean age of a substitution-neutrality mutation in a finite population

Another way to evaluate the age distribution of mutations is the mean number of generations since an allele (that now has intermediate frequency $x$) had a lower frequency $p$, i.e., $p \le x$. With some technical modifications and refinements, we follow the approach proposed by Kimura and Ohta (1973) to derive the mean age of a mutation in a finite population in the case of substitution neutrality. Let $\phi(x, p; t)$ be the probability density of allele frequency ($x$) at time $t$, given the initial frequency $p$.
It follows that $T_i(x)$, the $i$-th moment of $t$ ($i=0, 1, 2, \ldots$) of a mutation with the gene frequency ($x$), given the initial frequency $p$, is determined by (14) Therefore, the mean age of a mutation with allele frequency ($x$) can be calculated by (15) Let $\Psi(t|x, p) = \phi(x, p; t) / \int_0^\infty \phi(x, p; t)\,dt = \phi(x, p; t)/T_0(x)$ be the time (or allele age) distribution conditional on the allele frequency ($x$) and the initial frequency ($p$). It follows that Eq.(15) can be written as $\bar{T}_{age}(x) = \int_0^\infty t\,\Psi(t|x, p)\,dt$, which can be intuitively interpreted in the conventional way. Kimura and Ohta (1973) showed that $T_0(x)$ and $T_1(x)$ satisfy the following ordinary differential equations (16) Note that Eq.(16) must satisfy the following constraints: $T_0(x) < \infty$ for the regularity of the probability density, and $T_1(x) < \infty$ for $p \le x \le 1$, which means that a mutant with the initial frequency $p$ can reach the frequency $x$ in a finite time. As shown in Methods in detail, we obtain (17) In particular, we are interested in $\bar{T}_{age}(x)$ in the case of a very low initial frequency such that $p \to 0$; in this case $T_{fix} = T_{fix}(0)$ is given by Eq.(13). One can further verify that in the case of no FSI, i.e., $F=0$, Eq.(17) reduces to the well-known formula (18)

Case study: the mean age of nonsynonymous mutations in the human population

We utilized the human population genetics data of Fu et al. (2013) to carry out a preliminary analysis. Fu et al. (2013) re-sequenced about fifteen thousand genes in over six thousand individuals of European American and African American ancestry and inferred the age of over one million autosomal single nucleotide variants (SNVs).
Among the different types of variants they studied, we focused on nonsynonymous mutations in protein-coding genes, because our previous study (Gu 2025a) has shown an intermediate FSI ($F \approx 0.5$) for those mutations in the human population. It was estimated (Fu et al. 2013) that the average age of nonsynonymous mutations was about $2.1 \times 10^4$ years (European American) or about $3.1 \times 10^4$ years (African American). Those estimates were obtained under the assumption of $F = 0$. By Eq.(17) and Eq.(18), we re-estimated those ages under $F = 0.5$ and obtained average ages of about $1.0 \times 10^4$ years (European American) and about $1.4 \times 10^4$ years (African American). The average age of nonsynonymous mutations could therefore be considerably overestimated if FSI is not considered.
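The size of the overestimation implied by these figures can be made explicit with simple arithmetic. The short Python sketch below only divides the ages reported under $F=0$ by the ages re-estimated under $F=0.5$, as quoted above; the variable and function names are hypothetical, and no re-estimation by Eq.(17) is performed.

```python
# Average ages of nonsynonymous mutations in years, as quoted in the text.
AGE_WITHOUT_FSI = {"European American": 2.1e4, "African American": 3.1e4}  # F = 0
AGE_WITH_FSI = {"European American": 1.0e4, "African American": 1.4e4}     # F = 0.5


def overestimation_factor(age_without_fsi, age_with_fsi):
    """Ratio of the age estimated without FSI to the age estimated with FSI."""
    return age_without_fsi / age_with_fsi


factors = {
    population: overestimation_factor(AGE_WITHOUT_FSI[population], AGE_WITH_FSI[population])
    for population in AGE_WITHOUT_FSI
}

if __name__ == "__main__":
    for population, factor in factors.items():
        print(f"{population}: about {factor:.1f}-fold")
```

For both populations the age estimated without FSI is roughly twice the age estimated under $F=0.5$.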

Discussion

The allele age of selection-duality mutation

In this work we addressed the allele-age problem under the Wright-Fisher model of FSI (fluctuating selection among individuals). Two measures were studied. The first one is the mean first-arrival age ($\tilde{T}_x$) of a mutation at frequency $x$. We are particularly interested in the case of selection duality, where a mutation that is slightly beneficial on average is actually subject to negative selection. Since the general solution of $\tilde{T}_x$ in the range of selection duality is not available, we focused on $\tilde{T}_x$ at the substitution neutrality (the upper bound of selection duality), the FSI-neutrality (the midpoint) and the generic neutrality (the lower bound), respectively. The overall picture we attempted to provide is as follows:

$$[\tilde{T}_x]_{\text{substitution neutrality}} = [\tilde{T}_x]_{\text{FSI-neutrality}} \approx [\tilde{T}_x]_{\text{generic neutrality}} \qquad (19)$$

That is, the mean first-arrival age ($\tilde{T}_x$) of a mutation at the substitution neutrality is theoretically identical to that at the FSI-neutrality, which is numerically close to that at the generic neutrality. Tentatively, we propose that the age profile of allele frequency may be virtually universal under selection-duality. The second measure is the mean age of a mutation with frequency $x$ at the substitution neutrality, $\bar{T}_{age}(x)$, which we derived by the method of Kimura and Ohta (1973). One may envisage that $\bar{T}_{age}(x)$ is also universal under selection-duality. While this conjecture is plausible, the method we used here cannot be applied to the FSI-neutrality or the generic neutrality. Further study is needed to test whether a relationship similar to Eq.(19) also holds for $\bar{T}_{age}(x)$.
One possible approach to solving this issue is to invoke the method developed by Maruyama (1974), which is mathematically sophisticated.

The coalescent theory of allele age

Coalescent theory is a powerful framework in population genetics that models the ancestry of genes backward in time, tracing sampled DNA lineages back until they merge (coalesce) into a single common ancestor. There is an intrinsic relationship between coalescence and the mean age of a mutant given the allele frequency. Let $T_{n,b}$ denote the age of a mutant having $b$ copies in a sample of $n$ genes ($0 < b < n$). Griffiths and Tavaré (1998) derived the general formulas for the mean and variance of $T_{n,b}$, respectively. In the case of a constant population size, for instance, the expectation of $T_{n,b}$ is given by (20) Griffiths and Tavaré (1998) showed that, when the sample size approaches infinity, i.e., $n \to \infty$, with the sample frequency $b/n \to x$, the expected $T_{n,b}$ approaches the mean age of a strictly neutral mutant with frequency ($x$), that is, (21) which was first derived by Kimura and Ohta (1973) and is also shown by Eq.(18). An intriguing problem is how to extend the coalescent theory to the case of FSI. Our goal is to formulate a coalescent framework for selection-duality mutants under FSI such that the expected $T_{n,b}$, denoted by $E[T^*_{n,b}]$, converges to the mean age $\bar{T}_{age}(x)$ as the sample size approaches infinity. We expect this to be technically very difficult, and some approximations must be made. Moreover, the conceptual clarification is even more challenging: there are three types of neutrality under FSI (substitution neutrality, FSI-neutrality and generic neutrality), which reveal distinct population genetics features such as the sampling property (Sawyer and Hartl 1992); see Gu (2025b) for a detailed analysis. At the current stage, their relationship with coalescent neutrality remains unclear. As the coalescent analysis treats genealogies as random processes rather than fixed trees, it seems difficult, under FSI, to derive an elegant probabilistic framework for understanding how present-day genetic diversity arose from past events such as demographic histories. The impact of FSI on coalescent-based inference will be a goal of future study.

Methods

Age of a mutation with a given allele frequency: approximated by the first-arrival theory

Let $\mu(x)$ and $\sigma^2(x)$ be, respectively, the mean and the variance of the change in one generation of the frequency of a mutant allele having frequency $x$. This diffusion model assumes that the stochastic process of change in gene frequency is time-homogeneous, that is, the mean selection coefficient of the mutant remains constant over time while it may fluctuate among individuals. The average number of generations until the mutant reaches frequency $x$ for the first time, starting from a lower frequency $p$, denoted by $\tilde{T}_x(p)$, can be derived by the diffusion method. Let $u_x(\xi)$ be the probability that a mutant first reaches frequency $x$ before being lost, i.e., before reaching the zero boundary, given the initial frequency $\xi$, where the condition $0 < \xi \le x$ holds. It can be calculated by (M1) where $G(\xi)$ is defined by (M2) Meanwhile, let $\phi_x(\xi)$ be the sojourn time in the interval $(0, x]$, given the initial frequency $\xi$ that satisfies $0 < \xi \le x$, as given by (M3) It follows that the average number of generations until the mutant reaches frequency $x$ for the first time, starting from a lower frequency $p$, is given by (M4) The first term on the right-hand side of Eq.(M4) represents the average sojourn time of the mutant with the frequency between $p$ and $x$, while the second term represents that of the mutant with the frequency between 0 and $p$. It should be noticed that $T_{fix}(p)$, the mean fixation time of a mutation given the initial frequency $p$ (Kimura and Ohta 1969), is a special case of $\tilde{T}_x(p)$ as $x \to 1$, that is, (M5) where $u_x(\xi) \to u(\xi)$ and $\phi_x(\xi) \to \phi(\xi)$.
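The arrival probability $u_x(\xi)$ can be evaluated numerically for any pair $\mu(x)$, $\sigma^2(x)$. The Python sketch below assumes the standard diffusion forms $G(y)=\exp[-\int_0^y 2\mu(z)/\sigma^2(z)\,dz]$ and $u_x(\xi)=\int_0^\xi G(y)\,dy / \int_0^x G(y)\,dy$, which are taken here to correspond to Eqs.(M1)-(M2). It is checked only against the no-FSI additive case $\mu(x)=s\,x(1-x)$, $\sigma^2(x)=x(1-x)/(2N_e)$, for which the closed form $(1-e^{-4N_e s \xi})/(1-e^{-4N_e s x})$ is known. The function name, grid size and parameter values are illustrative assumptions.

```python
import math


def arrival_probability_numeric(xi, x, mu, sigma2, steps=4000):
    """u_x(xi) = int_0^xi G(y) dy / int_0^x G(y) dy on a midpoint grid.

    G(y) = exp(-int_0^y 2 * mu(z) / sigma2(z) dz); requires 0 < xi <= x.
    """
    h = x / steps
    exponent = 0.0   # running integral of 2*mu/sigma2 up to the left cell edge
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        y_mid = (i + 0.5) * h
        rate = 2.0 * mu(y_mid) / sigma2(y_mid)
        g_mid = math.exp(-(exponent + 0.5 * rate * h))  # G at the cell midpoint
        denominator += g_mid * h
        if y_mid <= xi:
            numerator += g_mid * h
        exponent += rate * h
    return numerator / denominator


if __name__ == "__main__":
    ne, s = 1000, 1e-3  # illustrative values (assumption), 4*Ne*s = 4
    mu = lambda y: s * y * (1.0 - y)
    sigma2 = lambda y: y * (1.0 - y) / (2.0 * ne)
    closed = math.expm1(-4 * ne * s * 0.25) / math.expm1(-4 * ne * s * 0.5)
    print(arrival_probability_numeric(0.25, 0.5, mu, sigma2), closed)
```

Replacing `mu` and `sigma2` with other forms only changes the two callables; the integration routine itself is unchanged.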
The average number of generations until the mutant reaches frequency $x$ for the first time since its single origin can be calculated by setting the initial allele frequency to $p=1/(2N)$, where $N$ is the census population size. In this case, the age distribution of a mutant since its origin can be approximated by letting $p \to 0$. One can show that the second term on the right-hand side of Eq.(M5) approaches 0, leading to Eq.(7).

(First-arrival) age of selection duality mutations

The general formula

According to Eqs.(2-4), one can derive the following function under the FSI model, that is, (M6) where function $A(x)$ is given by (M7) By some calculus, one can show that the analytical solution of Eq.(7) is available only when $\rho$ is an integer (positive, negative or zero), except for $\rho=-1$. Since we are interested in the scenario of selection duality defined by the range $-4 \le \rho \le 0$, we derive $\tilde{T}_x$ at the FSI-neutrality ($\rho = -2$), the substitution neutrality ($\rho = 0$) and the generic neutrality ($\rho = -4$), which together provide an overall pattern of the age distribution of selection duality mutations.

Age of mutation at FSI neutrality ($\bar{s}=\varepsilon_{mut}^2$, or $\rho=-2$)

By Eq.(M6), the three functions $u_x(\xi)$, $\phi_x(\xi)$ and $A(x)$ under $\rho=-2$ are, respectively, given by (M8) and so the product $\phi_x(\xi)u_x(\xi)[1 - u_x(\xi)]$ is given by Eq.(11). We thus derive the (first-arrival) age distribution of mutations at FSI-neutrality by Eq.(12).
Age of substitution neutrality mutation ($\bar{s}=2\varepsilon_{mut}^2$, or $\rho=0$)

In the case of substitution neutrality ($\rho = 0$), the expressions of the three functions $u_x(\xi)$, $\phi_x(\xi)$ and $A(x)$ are as follows (M9) Although each function in Eq.(M9) differs from the corresponding one in Eq.(M8), we find that the product $\phi_x(\xi)u_x(\xi)[1 - u_x(\xi)]$ based on Eq.(M9) is precisely identical to the same product based on Eq.(M8) in the case of FSI-neutrality. In other words, the (first-arrival) age distribution of mutations at the substitution neutrality is identical to that at the FSI-neutrality.

Age of generic neutrality mutation ($\bar{s}=0$, or $\rho=-4$)

Plugging $\rho = -4$ into Eq.(M6), we have (M10) Eq.(M10) shows that the first-arrival age of a mutation at the generic neutrality differs from that at the substitution neutrality or the FSI-neutrality. The age distribution of mutations at generic neutrality is then given by (M11) where $B(x)$ is given by (M12) The result of Eq.(M11) is algebraically tedious, yet numerical integration based on Eq.(M11) is straightforward.
Besides, for the purpose of comparison, one may calculate the age distribution of mutations at a negative selection-duality ($\bar{s}=0.5\varepsilon_{mut}^2$, or $\rho=-3$) analytically, with the following specifications (M13)

The mean age of a substitution-neutrality mutation

We first formulate this problem briefly. Let $T_i(x)$ be the $i$-th moment of $t$ ($i=0, 1, 2, \ldots$) of a mutation with the gene frequency ($x$), given the initial frequency $p$, as given by Eq.(14). It follows that the mean age of a mutation with allele frequency ($x$) can be calculated by (M14) Kimura and Ohta (1973) showed that $T_0(x)$ and $T_1(x)$ satisfy the following ordinary differential equations (M15) which satisfy the following constraints: $T_0(x) < \infty$ and $T_1(x) < \infty$ for $p \le x \le 1$. We first consider $T_0(x)$. By integrating the first differential equation of Eq.(M15) twice, and rewriting $\sigma^2(x)$ as (M16) we obtain (M17) where $C_1$ and $C_2$ are two arbitrary constants. The constraint $T_0(x) < \infty$ for $p \le x \le 1$ implies that the term $C_1 x + C_2$ in the numerator should cancel the term $1 - x$ in the denominator. Without loss of generality, one may choose $C_2 = -C_1 = A/[4N_e(1 - F)]$, where $A$ is an arbitrary constant, resulting in (M18) Therefore, the differential equation of $T_1(x)$ in Eq.(M15) can be specified as follows (M19) Integrating this equation twice with respect to $x$, we obtain (M20) where $C'_1$ and $C'_2$ are arbitrary constants that can be determined as follows. The constraint $T_1(x) < \infty$ for $p \le x \le 1$ implies that the term $C'_1 x + C'_2$ should cancel the term $1 - x$ in the denominator.
Without loss of generality, one may choose 𝐶′2 = −𝐶′1 = 𝐵, resulting in

(M21)

It follows from Eq.(M18) that the mean age of a mutation under the substitution neutrality can be expressed as follows

(M22)

which is independent of the constant A. The arbitrary constant B is determined by the boundary condition as the allele frequency x approaches unity: Tage(x) should approach the average number of generations until fixation (Kimura and Ohta 1969), that is,

(M23)

Replacing B in Eq.(M22) by Eq.(M23), we obtain

(M24)
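As a point of reference for the boundary condition used to fix B, the same limiting behaviour can be checked in the classical strictly neutral case. The sketch below is not Eq.(M24); it evaluates the well-known Kimura and Ohta (1973) mean age of a strictly neutral allele at frequency x (for initial frequency p → 0), −4Ne·x·ln(x)/(1 − x) generations, and shows that it approaches 4Ne, the mean fixation time of Kimura and Ohta (1969), as x approaches unity. The function name `neutral_mean_age` and the parameter `n_e` are hypothetical names introduced here for illustration, and no FSI correction is included.

```python
import math


def neutral_mean_age(x: float, n_e: float) -> float:
    """Mean age (in generations) of a strictly neutral allele at frequency x.

    Classical Kimura-Ohta (1973) result for initial frequency p -> 0:
        age(x) = -4 * Ne * x * ln(x) / (1 - x)
    This is the strict-neutrality baseline only, not the FSI result.
    """
    if not 0.0 < x <= 1.0:
        raise ValueError("allele frequency x must lie in (0, 1]")
    if x == 1.0:
        # Limit of -x*ln(x)/(1-x) as x -> 1 is 1, so the age tends to 4*Ne.
        return 4.0 * n_e
    return -4.0 * n_e * x * math.log(x) / (1.0 - x)


if __name__ == "__main__":
    n_e = 1000.0  # assumed effective population size, for illustration only
    for x in (0.01, 0.1, 0.5, 0.9, 0.999):
        print(f"x = {x:<6} mean age = {neutral_mean_age(x, n_e):10.1f} generations")
```

The mean age increases monotonically with x in this baseline, which is the qualitative behaviour the boundary condition relies on: an allele near fixation is, on average, about as old as the mean fixation time.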

References

Chandler et al. (2013) Does your gene need a background check? How genetic background impacts the analysis of mutations, genes, and evolution. Trends Genet

Crow J, Kimura M (1970) An Introduction to Population Genetics Theory. Minneapolis (MN): Burgess Publishing Company.

de Jong et al. (2024) Moderating the neutralist–selectionist debate: exactly which propositions are we debating, and which arguments are valid? Biological Reviews

Dowell et al. (2010) Genotype to phenotype: a complex problem. Science 328:469.

Elowitz et al. (2002) Stochastic Gene Expression in a Single Cell. Science

Eldar A et al. (2009) Partial penetrance facilitates developmental evolution in bacteria. Nature 460:510–514.

Ewens WJ (2004) Mathematical Population Genetics: Theoretical Introduction. Vol. 1. New York (NY): Springer.

Galtier (2024) Half a Century of Controversy: The Neutralist/Selectionist Debate in Molecular Evolution. GBE

Griffiths RC, Tavaré S (1988) The age of a mutation in a general coalescent tree. Communications in Statistics. Stochastic Models.

Gu X (2021) Random penetrance of mutations among individuals: a new type of genetic drift in molecular evolution. Phenomics

Gu X (2025a) Fluctuating Selection among Individuals (FSI) as a Novel Genetic Drift in Molecular Evolution. BioRxiv.

Gu X (2025b) Population Genetics of Fluctuating Selection among Individuals (FSI): a New Paradigm of Molecular Evolution. BioRxiv.

Hahn MW (2008) Toward a selection theory of molecular evolution. Evolution 62:255–265.

Jensen et al. (2019) The importance of the Neutral Theory in 1968 and 50 years on: A response to Kern and Hahn 2018. Evolution 73:111–114.

Jensen et al. (2025) Genetic modifiers and ascertainment drive variable expressivity of complex disorders. Cell

Kern AD, Hahn MW (2018) The Neutral Theory in Light of Natural Selection. Mol Biol Evol 35:1366–1371.

Kimura M (1962) On the probability of fixation of mutant genes in a population. Genetics 47:713–719.

Kimura M (1983) The Neutral Theory and Molecular Evolution. Cambridge University Press, New York.

Kimura M, Ohta T (1969) The average number of generations until fixation of a mutant gene in a finite population. Genetics 61(3):763. doi:10.1093/genetics/61.3.763

Kimura M, Ohta T (1973) The age of a neutral mutant persisting in a finite population. Genetics.

Khoury MJ, Flanders WD, Beaty TH (1988) Penetrance in the presence of genetic susceptibility to environmental factors.

Lehner B (2013) Genotype to phenotype: lessons from model organisms for human genetics. Nat Rev Genet 14:168–178.

Maamar H, Raj A, Dubnau D (2007) Noise in gene expression determines cell fate in Bacillus subtilis. Science 317

Mullis et al. (2018) The complex underpinnings of genetic background effects. Nat Commun 9:3548.

Munoz-Gomez et al. (2021) Constructive neutral evolution 20 years later. JME

Ozbudak et al. (2002) Regulation of noise in the expression of a single gene. Nat Genet

Raj A et al. (2010) Variability in gene expression underlies incomplete penetrance. Nature 463:913–918.

Raj A, van Oudenaarden A (2008) Nature, nurture, or chance: stochastic gene expression and its consequences. Cell

Riordan JD, Nadeau JH (2017) From Peas to Disease: Modifier Genes, Network Resilience, and the Genetics of Health. Am J Hum Genet 101:177–191.

Sawyer SA, Hartl DL (1992) Population genetics of polymorphism and divergence. Genetics 132:1161–1176.

Süel (2007) Tunability and noise dependence in differentiation dynamics. Science

Taylor MB, Ehrenreich IM (2014) Genetic interactions involving five or more genes contribute to a complex trait in yeast. PLoS Genet

Vu V et al. (2015) Natural Variation in Gene Expression Modulates the Severity of Mutant Phenotypes. Cell

Fig. 1. The first-arrival age of alleles plotted against the allele frequency under different levels of FSI.
