BACKGROUND: Imputation of genotypes from low-density to higher-density chips is a cost-effective way to obtain high-density genotypes for many animals, based on the genotypes of only a relatively small subset of animals (the reference population) on the high-density chip. Several factors influence the accuracy of imputation; our objective was to investigate the effects of the size of the reference population, of the imputation method used, and of that method's parameters. Genotypes were imputed from 50 000 (moderate-density) to 777 000 (high-density) single nucleotide polymorphisms (SNPs).
METHODS: The effect of reference population size was studied in two datasets: one with 548 and one with 1289 Holstein animals, genotyped with the Illumina BovineHD chip (777 k SNPs). A third dataset included the 548 animals genotyped with the 777 k SNP chip and 2200 animals genotyped with the Illumina BovineSNP50 chip. In each dataset, 60 animals were chosen as validation animals, for which all high-density genotypes were masked, except for the Illumina BovineSNP50 markers. Imputation was studied in a subset of six chromosomes, using the imputation software programs Beagle and DAGPHASE.
RESULTS: Imputation with DAGPHASE and Beagle resulted in 1.91% and 0.87% allelic imputation error rates in the dataset with 548 high-density genotypes, when scale and shift parameters were 2.0 and 0.1, and 1.0 and 0.0, respectively. When Beagle was used alone, the imputation error rate was 0.67%. When the information obtained by Beagle was subsequently used in DAGPHASE, the imputation error rate was slightly higher (0.71%). When 2200 moderate-density genotypes were added and Beagle was used alone, the imputation error rate was slightly lower (0.64%). The lowest imputation error rate (0.41%) was obtained with Beagle in the reference set with 1289 high-density genotypes.
CONCLUSIONS: For imputation of genotypes from the 50 k to the 777 k SNP chip, Beagle gave the lowest allelic imputation error rates. Imputation error rates decreased with increasing size of the reference population. Where computing time is limiting, DAGPHASE using information from Beagle can be considered as an alternative, since it reduces computation time and increases imputation error rates only slightly.
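The abstract reports allelic imputation error rates for validation animals whose high-density genotypes were masked, but it does not give the formula. The sketch below shows one common way such a rate can be computed. It assumes genotypes are coded as 0, 1 or 2 copies of an allele and that the error rate is the number of wrongly imputed alleles divided by the total number of masked alleles. The function name, the coding and the toy data are illustrative assumptions, not taken from the study.

```python
def allelic_error_rate(true_geno, imputed_geno, masked_snps):
    """Share of wrongly imputed alleles over the masked SNPs.

    Assumption (not from the source): genotypes are coded as 0/1/2
    allele counts, so |true - imputed| counts the wrong alleles at
    one genotype, and each genotype contributes 2 alleles.
    """
    wrong = 0
    total = 0
    for true_row, imputed_row in zip(true_geno, imputed_geno):
        for snp in masked_snps:
            wrong += abs(true_row[snp] - imputed_row[snp])
            total += 2
    if total == 0:
        raise ValueError("no masked genotypes to evaluate")
    return wrong / total


# Toy data: 3 validation animals x 5 SNPs (hypothetical values).
true_geno = [
    [0, 1, 2, 1, 0],
    [2, 2, 1, 0, 1],
    [1, 0, 0, 2, 2],
]
imputed_geno = [
    [0, 1, 1, 1, 0],
    [2, 2, 1, 0, 1],
    [1, 0, 0, 0, 2],
]
# SNPs 0 and 3 stand in for markers on the moderate-density chip and
# are therefore observed; SNPs 1, 2 and 4 were masked and imputed.
masked_snps = [1, 2, 4]

rate = allelic_error_rate(true_geno, imputed_geno, masked_snps)
print(f"allelic imputation error rate: {rate:.4f}")
```

Only the masked SNPs enter the calculation, which mirrors the validation design described in METHODS, where the moderate-density markers stay observed.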