CUED Publications database

Robust estimation of local genetic ancestry in admixed populations using a nonparametric Bayesian approach.

Sohn, K-A and Ghahramani, Z and Xing, EP (2012) Robust estimation of local genetic ancestry in admixed populations using a nonparametric Bayesian approach. Genetics, 191. pp. 1295-1308.

We present a new haplotype-based approach for inferring local genetic ancestry of individuals in an admixed population. Most existing approaches for local ancestry estimation ignore the latent genetic relatedness between ancestral populations and treat them as independent. In this article, we exploit such information by building an inheritance model that describes both the ancestral populations and the admixed population jointly in a unified framework. Based on an assumption that the common hypothetical founder haplotypes give rise to both the ancestral and the admixed population haplotypes, we employ an infinite hidden Markov model to characterize each ancestral population and further extend it to generate the admixed population. Through an effective utilization of the population structural information under a principled nonparametric Bayesian framework, the resulting model is significantly less sensitive to the choice and the amount of training data for ancestral populations than state-of-the-art algorithms. We also improve the robustness under deviation from common modeling assumptions by incorporating population-specific scale parameters that allow variable recombination rates in different populations. Our method is applicable to an admixed population from an arbitrary number of ancestral populations and also performs competitively in terms of spurious ancestry proportions under a general multiway admixture assumption. We validate the proposed method by simulation under various admixing scenarios and present empirical analysis results from a worldwide-distributed dataset from the Human Genome Diversity Project.
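As a rough illustration of the kind of inference the abstract describes, the sketch below computes per-marker ancestry posteriors for one admixed haplotype. It is a deliberately simplified finite hidden Markov model, not the infinite hidden Markov model with shared founder haplotypes used in the paper. The function name, the biallelic emission model, the exponential switch probability, and all parameter names (`allele_freqs`, `mix`, `scales`, `rho`) are assumptions made for illustration; only the idea of population-specific scale parameters on the recombination rate is taken from the abstract.

```python
import math


def local_ancestry_posterior(haplotype, allele_freqs, distances, mix, scales, rho=1.0):
    """Toy forward-backward pass: P(ancestry at marker t = k | haplotype).

    haplotype    : list of 0/1 alleles, length T
    allele_freqs : allele_freqs[k][t] = assumed frequency of allele 1 in population k
    distances    : genetic distance between markers t-1 and t, length T-1
    mix          : assumed admixture proportions, length K, summing to 1
    scales       : hypothetical population-specific recombination scale factors
    rho          : hypothetical baseline recombination rate
    """
    K, T = len(allele_freqs), len(haplotype)

    def emit(k, t):
        f = allele_freqs[k][t]
        return f if haplotype[t] == 1 else 1.0 - f

    def trans(j, k, t):
        # Move from ancestry j at marker t-1 to ancestry k at marker t.
        # With probability `stay` no ancestry switch occurs; otherwise the
        # new ancestry is drawn from the admixture proportions.
        stay = math.exp(-scales[j] * rho * distances[t - 1])
        return (1.0 - stay) * mix[k] + (stay if j == k else 0.0)

    def normalise(row):
        total = sum(row)
        return [v / total for v in row]

    # Forward pass, rescaled at every marker to avoid underflow.
    fwd = [normalise([mix[k] * emit(k, 0) for k in range(K)])]
    for t in range(1, T):
        fwd.append(normalise([
            emit(k, t) * sum(fwd[t - 1][j] * trans(j, k, t) for j in range(K))
            for k in range(K)
        ]))

    # Backward pass, also rescaled.
    bwd = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        bwd[t] = normalise([
            sum(trans(j, k, t + 1) * emit(k, t + 1) * bwd[t + 1][k] for k in range(K))
            for j in range(K)
        ])

    # Combine and renormalise to get the posterior at each marker.
    return [normalise([fwd[t][k] * bwd[t][k] for k in range(K)]) for t in range(T)]


# Made-up example: two ancestral populations, ten markers, one ancestry switch.
demo_haplotype = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
demo_freqs = [[0.9] * 10, [0.1] * 10]
demo_posterior = local_ancestry_posterior(
    demo_haplotype, demo_freqs, distances=[0.01] * 9, mix=[0.5, 0.5], scales=[1.0, 1.0]
)
```

In this toy setting the allele frequencies are treated as known, which is exactly the dependence on ancestral training data that the abstract says the full model is designed to reduce.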

Item Type: Article
Uncontrolled Keywords: Algorithms; Bayes Theorem; Computer Simulation; Genetics, Population; Genome, Human; Haplotypes; Human Genome Project; Humans; Markov Chains; Models, Genetic; Models, Statistical; Mutation Rate; Reproducibility of Results
Divisions: Div F > Computational and Biological Learning
Depositing User: Cron Job
Date Deposited: 17 Jul 2017 19:05
Last Modified: 17 Jul 2018 06:09