Abstract:
In genome-wide association studies (GWAs), population stratification may cause inflated type I errors and overly optimistic test results when not properly corrected for. During the past decade, several methods have been proposed for association testing in the presence of population stratification. Among these, principal component-based approaches are the most popular. Principal component analysis (PCA) transforms the data to a new coordinate system such that the projection of the data along the first new coordinate (called PC1) has the largest variance, the second PC has the second largest variance, and so on. In practice, two components are usually enough to adjust or control for population stratification, and they can easily be included as covariates in parametric association models. Despite the success of this strategy, some caveats need further attention. In particular, principal component-based methods generally do not account for cryptic relatedness (kinship) between supposedly unrelated individuals, are not straightforwardly adapted to accommodate family-based designs or mixtures of families and unrelated individuals, and do not always take proper account of the trait under investigation.
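The principal component adjustment described above can be illustrated with a minimal sketch. It is not the implementation used in any particular GWA package: the function names `top_principal_components` and `snp_effect_adjusted`, the simulated two-population genotypes, and the choice of a plain least-squares model are all assumptions made for illustration.

```python
import numpy as np

def top_principal_components(genotypes, n_components=2):
    """Return the top PC scores of an (individuals x SNPs) genotype matrix."""
    # Centre and scale each SNP column; constant SNPs are left at zero.
    centred = genotypes - genotypes.mean(axis=0)
    sd = centred.std(axis=0)
    sd[sd == 0] = 1.0
    standardized = centred / sd
    # SVD: the columns of U * S are the PC scores, ordered by decreasing variance.
    u, s, _ = np.linalg.svd(standardized, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def snp_effect_adjusted(trait, snp, pcs):
    """Least-squares SNP effect with the PCs included as covariates."""
    design = np.column_stack([np.ones(len(trait)), snp, pcs])
    coef, *_ = np.linalg.lstsq(design, trait, rcond=None)
    return coef[1]  # column 0 is the intercept, column 1 the SNP

# Hypothetical toy data: two subpopulations with different allele frequencies.
rng = np.random.default_rng(0)
n_per_pop, n_snps = 100, 200
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0, 0.15, n_snps), 0.05, 0.95)
genotypes = np.vstack([
    rng.binomial(2, freq_a, size=(n_per_pop, n_snps)),
    rng.binomial(2, freq_b, size=(n_per_pop, n_snps)),
]).astype(float)

pcs = top_principal_components(genotypes)  # PC1 and PC2 scores per individual
```

The two columns of `pcs` can then be passed to `snp_effect_adjusted` together with a trait vector and a single SNP column, which mirrors the idea of including the components as covariates in a parametric association model.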
In this work, we present an easy-to-use alternative that addresses the aforementioned issues. For quantitative traits, we propose first to fit the mixed polygenic model (possibly including important non-genetic confounders as covariates), second to derive “polygenic” residuals from this model, thereby removing genomic kinship relationships, and third to treat these residuals as new traits in a classical genome-wide QTL analysis for “unrelated individuals”. The polygenic component of the mixed polygenic model describes the contribution from multiple independently segregating genes, each having a small additive effect on the trait under investigation. Via an extensive simulation study, with various settings of population stratification and admixture, we show that this approach not only removes most of the “relatedness” between individuals (cryptic or known), but also removes most of the remaining substructure caused by population stratification or admixture. As a proof of concept, we demonstrate the efficiency of this robust method in controlling for population stratification on real-life genome-scale data from the SNP Health Association Resource (SHARe) Asthma Resource project (SHARP) (dbGaP accession number phs000166.v2.p1). We also provide leads for extending this method to dichotomous traits.
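The three steps for quantitative traits can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the model y = Xβ + g + e with g ~ N(0, σ_g² K) and e ~ N(0, σ_e² I), estimates the heritability h² by a simple maximum-likelihood grid search, and takes the "polygenic" residuals to be the estimated environmental part ê = (1 − h²)(h²K + (1 − h²)I)⁻¹(y − Xβ̂). The function names, the grid, and the simulated family data are hypothetical.

```python
import numpy as np

def polygenic_residuals(trait, kinship, covariates=None, grid=None):
    """Steps 1-2: fit the mixed polygenic model and return (residuals, h2)."""
    n = len(trait)
    x = np.ones((n, 1)) if covariates is None else np.column_stack([np.ones(n), covariates])
    if grid is None:
        grid = np.linspace(0.0, 0.99, 100)  # candidate heritabilities (assumption)
    # Rotating by the eigenvectors of K makes the trait covariance diagonal.
    eigvals, eigvecs = np.linalg.eigh(kinship)
    eigvals = np.clip(eigvals, 0.0, None)
    y_rot, x_rot = eigvecs.T @ trait, eigvecs.T @ x
    best = None
    for h2 in grid:
        d = h2 * eigvals + (1.0 - h2)        # diagonal of V / total variance
        w = 1.0 / d
        beta = np.linalg.solve(x_rot.T @ (w[:, None] * x_rot), x_rot.T @ (w * y_rot))
        r = y_rot - x_rot @ beta
        sigma2 = np.sum(w * r**2) / n        # profiled total variance
        loglik = -0.5 * (n * np.log(sigma2) + np.sum(np.log(d)))
        if best is None or loglik > best[0]:
            best = (loglik, h2, r, w)
    _, h2, r, w = best
    # Environmental residual, rotated back to the original individuals.
    return eigvecs @ ((1.0 - h2) * w * r), h2

def qtl_scan(residuals, genotypes):
    """Step 3: per-SNP slope and t statistic, treating individuals as unrelated.

    Assumes no SNP column is constant (a constant column would divide by zero).
    """
    n = len(residuals)
    g = genotypes - genotypes.mean(axis=0)
    r = residuals - residuals.mean()
    sxx = np.sum(g**2, axis=0)
    slope = (g.T @ r) / sxx
    rss = np.sum(r**2) - slope**2 * sxx
    return slope, slope / np.sqrt(rss / (n - 2) / sxx)

# Hypothetical toy data: 50 families of 4, relationship 0.5 within a family.
rng = np.random.default_rng(1)
n_fam, fam_size, n_snps = 50, 4, 20
n = n_fam * fam_size
block = np.full((fam_size, fam_size), 0.5) + 0.5 * np.eye(fam_size)
kinship = np.kron(np.eye(n_fam), block)
genotypes = rng.binomial(2, 0.3, size=(n, n_snps)).astype(float)
trait = np.linalg.cholesky(kinship) @ rng.normal(size=n) + rng.normal(size=n)

residuals, h2_hat = polygenic_residuals(trait, kinship)
slopes, t_stats = qtl_scan(residuals, genotypes)
```

A grid search keeps the sketch short and dependency-free; a production analysis would more plausibly use a dedicated mixed-model routine with a numerical optimizer, and a kinship matrix estimated from genome-wide markers rather than the known family blocks assumed here.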