Markov chain Monte Carlo sampling of gene genealogies conditional on genotype data from trios

To understand genetic associations with disease outcomes, it is useful to model the latent gene genealogies, or ancestral trees, that give rise to the genetic variability in a sample. To this end, we consider the genealogy of a target locus in genetic sequences from a random sample of unrelated individuals. Although the true genealogy is unknown, we can model its distribution conditional on the observed genotype data. Sampling from this conditional distribution, however, requires Markov chain Monte Carlo methods. In this presentation, I will first describe my gene genealogy sampler, sampletrees. I will then discuss a recent extension of the sampler for samples consisting of sequences from related individuals, specifically trios of two parents and their child. I will apply the trio-based sampler to real data to illustrate its use. Finally, I will discuss how the approach could be extended to more complex family structures.