Single Nucleotide Polymorphisms (SNPs) are commonly used to capture variations between populations. Genome-wide SNP data are often pruned based on linkage disequilibrium (LD) patterns, or small subsets of SNPs are selected (e.g., PCA-correlated SNPs), to reproduce the genomic structure of the complete data set. Identifying and differentiating between subpopulations using such a reduced set can become challenging, especially when similar geographic regions are involved or when spurious patterns are likely to exist.
Although PCA-based methods can resolve structure, they cannot infer ancestry. On the other hand, the structure of haplotypes in unrelated individuals can reveal useful information about genetic ancestry. Notably, haplotype composition and the pattern of LD between markers may vary between larger populations but may also play a role within more confined geographic regions. In addition, iterative pruning principal component analysis (ipPCA) has been shown to be a powerful tool to cluster subpopulations based on SNP profiles.
Despite the complexities associated with haplotype inference, we argue that added value can be obtained when the LD structure between SNPs is exploited in the search for relevant population strata. In this work, we propose to combine a novel LD-based haplotype encoding scheme with the ipPCA machinery to retrieve fine population substructures. The approach is compared to state-of-the-art methods in the context of population substructure and admixture analysis.
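To make the idea of iteratively splitting a sample along principal components more concrete, the following is a minimal sketch in Python. It is not the ipPCA implementation and does not include the haplotype encoding scheme, neither of which is specified in this abstract. It assumes a genotype matrix coded 0/1/2 (individuals in rows, SNPs in columns), bisects each group at the mean of its first principal component, and stops when that component explains too little variance. The function names and the `min_size` and `min_frac` parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def top_pc(genotypes):
    """Return PC1 scores and the fraction of variance PC1 explains."""
    centered = genotypes - genotypes.mean(axis=0)
    # The SVD of the column-centered matrix yields the principal components.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    var = s ** 2
    total = var.sum()
    frac = var[0] / total if total > 0 else 0.0
    return u[:, 0] * s[0], frac

def iterative_split(genotypes, min_size=5, min_frac=0.2):
    """Recursively bisect individuals along PC1; return one label per individual.

    Illustrative stopping rule (an assumption, not the published criterion):
    a group is left intact if PC1 explains less than `min_frac` of its
    variance or if either half would have fewer than `min_size` members.
    """
    n = genotypes.shape[0]
    labels = np.zeros(n, dtype=int)
    stack = [np.arange(n)]
    next_label = 0
    while stack:
        idx = stack.pop()
        if len(idx) >= 2 * min_size:
            scores, frac = top_pc(genotypes[idx])
            left, right = idx[scores < 0], idx[scores >= 0]
            if frac >= min_frac and len(left) >= min_size and len(right) >= min_size:
                stack.extend([left, right])
                continue
        # No further split: assign a final cluster label to this group.
        labels[idx] = next_label
        next_label += 1
    return labels
```

Calling `iterative_split` on a matrix built from two simulated groups with clearly different allele frequencies should assign each group its own label. On real data, the threshold-based stopping rule here would need to be replaced by a proper statistical test for remaining structure.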
Research Center/Unit :
Systems and Modeling Unit, Montefiore Institute and Bioinformatics and Modeling, GIGA-R
Disciplines :
Life sciences: Multidisciplinary, general & others
Author, co-author :
Chaichoompu, Kridsadakorn ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Bioinformatique
Fouladi, Ramouna; University of Liege > Montefiore Institute > Systems and Modeling Unit
Wangkumhang, Pongsakorn; National Center for Genetic Engineering and Biotechnology > Genome Institute > Biostatistics and informatics Laboratory
Wilantho, Alisa; National Center for Genetic Engineering and Biotechnology > Genome Institute > Biostatistics and informatics Laboratory
Chareanchim, Wanwisa; National Center for Genetic Engineering and Biotechnology > Genome Institute > Systems and Modeling Unit
Tongsima, Sissades; National Center for Genetic Engineering and Biotechnology > Genome Institute > Biostatistics and informatics Laboratory
Sakuntabhai, Anavaj; Institut Pasteur > Functional Genetics of Infectious Diseases Unit
Van Steen, Kristel; University of Liege > Montefiore Institute > Systems and Modeling Unit
Language :
English
Title :
Haplotype information combined with iterative pruning PCA (ipPCA) to improve population clustering
Publication date :
01 April 2014
Event name :
The 42nd European Mathematical Genetics Meeting 2014
Event organizer :
Statistical Genetics and Bioinformatics Group, Cologne Center for Genomics (CCG), University of Cologne