How many SNPs do we expect to observe in a sample of genes? This can be computed by modeling the evolution of a population of $N$ diploid individuals with $2N$ genes, in which each generation of genes is drawn with replacement from the previous generation. This scenario is known among biologists as the Wright-Fisher model of evolution. You can interactively simulate populations under this model here. Notice that each gene has a single ancestor. This contrasts with family histories, where each individual has two ancestors, a mother and a father. Our own genes, however, also follow the single-parent genealogies described by the Wright-Fisher model. Under this model, the expected number of SNPs is given by the following simple equation [13]:

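$$E[S] = \theta \sum_{i=1}^{n-1} \frac{1}{i} \qquad (1)$$

where $S$ is the number of SNPs in the sample, $n$ is the number of genes sampled, and $\theta = 4N\mu$, with $N$ the number of diploid individuals in the population and $\mu$ the mutation rate per gene per generation.
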
In the past we have generalized equation (1) for alignments
of arbitrary topology and looked at the distribution of SNPs in humans
[5]. In addition, we have analyzed the
microevolutionary implications of localized SNP patterns in the model plant
*Arabidopsis thaliana* [4].
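
As a rough illustration of the Wright-Fisher sampling scheme and of the expected number of SNPs, the following Python sketch may help. It assumes the standard form of the expectation, $\theta \sum_{i=1}^{n-1} 1/i$ with $\theta = 4N\mu$. The function names (`expected_snps`, `wright_fisher_generation`, `generations_to_common_ancestor`) are hypothetical and chosen only for this example; they are not taken from the simulation linked above.

```python
import random


def expected_snps(n, theta):
    """Expected number of SNPs in a sample of n genes,
    assuming the form theta * sum_{i=1}^{n-1} 1/i."""
    return theta * sum(1.0 / i for i in range(1, n))


def wright_fisher_generation(parents, rng):
    """Draw the next generation of genes by sampling the
    parental genes with replacement (one parent per gene)."""
    return [rng.choice(parents) for _ in parents]


def generations_to_common_ancestor(num_genes, rng):
    """Label every gene by its ancestor in generation 0 and iterate
    until all genes descend from a single ancestral gene."""
    genes = list(range(num_genes))
    generations = 0
    while len(set(genes)) > 1:
        genes = wright_fisher_generation(genes, rng)
        generations += 1
    return generations


if __name__ == "__main__":
    rng = random.Random(42)
    # Illustrative values only: 10 sampled genes, theta = 5.
    print(expected_snps(10, 5.0))
    # A population of 20 genes (N = 10 diploid individuals).
    print(generations_to_common_ancestor(20, rng))
```

Because every gene picks exactly one parent, the set of ancestral labels can only shrink over time, which is why the loop always terminates with a single common ancestor.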

The analysis of *A. thaliana* relied, like much of modern
population genetics, on a data structure known as the coalescent. This
describes a random genealogy of a sample of homologous genes [7,6]. We have used
the coalescent to explore the effect of sampling on the frequency
spectrum of nucleotide polymorphisms in expanding subdivided
populations [10] and to quantify historical population
size changes [9].
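
To make the idea of a random genealogy more concrete, here is a minimal Python sketch of the standard coalescent for a single, constant-size population. It is not the method used in the studies cited above, which deal with subdivided and changing populations. The sketch assumes that time is measured in units of $2N$ generations, that $k$ lineages coalesce at rate $k(k-1)/2$, and that mutations fall on the genealogy at rate $\theta/2$ per lineage, each creating a new SNP. All function names are hypothetical.

```python
import random


def poisson_count(mean, rng):
    """Number of unit-rate exponential arrivals before time `mean`,
    which is Poisson distributed with that mean."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_snps(n, theta, rng):
    """Simulate the number of SNPs in a sample of n genes by
    drawing one random genealogy and dropping mutations on it."""
    total_branch_length = 0.0
    k = n
    while k > 1:
        # Waiting time until two of the k lineages coalesce.
        waiting_time = rng.expovariate(k * (k - 1) / 2.0)
        # All k lineages accumulate mutations during this interval.
        total_branch_length += k * waiting_time
        k -= 1
    return poisson_count(theta / 2.0 * total_branch_length, rng)


if __name__ == "__main__":
    rng = random.Random(2024)
    n, theta, replicates = 5, 2.0, 4000  # illustrative values only
    mean_snps = sum(simulate_snps(n, theta, rng)
                    for _ in range(replicates)) / replicates
    print(mean_snps)
```

Under these assumptions, the average over many replicates should approach $\theta \sum_{i=1}^{n-1} 1/i$, since the expected total branch length of the genealogy is $2 \sum_{i=1}^{n-1} 1/i$.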