How many SNPs do we
expect to observe in a sample of genes? This can be computed by modeling the evolution of
a population of genes by drawing the next generation of
genes with replacement from the previous generation. Such a
scenario is known among biologists as the Wright-Fisher model of
evolution. You can interactively simulate populations under this model
here.
Notice that each gene has a single ancestor. This contrasts with
family histories, where each individual has two
ancestors, a
mother and a father. However, our genes also follow the single-parent
genealogies described by the Wright-Fisher model. Under this model, the expected number
of SNPs is given by the following simple equation [11]:

In the past we have generalized equation () for alignments
of arbitrary topology and looked at the distribution of SNPs in humans
[4]. In addition, we have analyzed the
microevolutionary implications of localized SNP patterns in the model plant
*Arabidopsis thaliana* [2].

The analysis of *A. thaliana* relied, like
much of modern population genetics, on a data structure known
as the coalescent. This describes a random genealogy
of a
sample of homologous genes [7,6]. We have
recently used the coalescent to explore the effect of sampling on the
frequency spectrum of nucleotide polymorphisms in expanding subdivided populations [].

Bernhard Haubold 2016-04-14