[en] Including animals in breed registries when pedigrees are incomplete, certifying breed-derived products such as meat, and optimizing genomic predictions sometimes rely on breed assignment tools. Such tools are usually based on genotypes and generally follow three main steps: 1) selecting informative SNPs, 2) developing a model that allows correct breed assignment, and 3) validating this model with new genotypes that were not used for selecting SNPs or developing the model. However, a wide diversity of methodologies can be applied to each of these steps, which raises the question of which strategy is best in terms of accuracy and computing time. The objective of this study is to provide guidelines for optimizing breed assignment based on recent research. We first give advice on building the reference population of animals that will be used for training the model and stress the importance of including animals that represent the diversity of these breeds. We then move to quality control and discuss the necessity of selecting SNPs, depending on the density of available genotypes and the chosen methodology. We suggest using machine learning techniques to develop the classification model. We also provide some advice for tuning the model using cross-validation and for evaluating its performance, e.g., using balanced accuracy. Another perspective on performance is computing time, not only for model development but also for routine use. Finally, we discuss the opportunity to include crossbreds in the development of such genomic tools, with the aim of avoiding the assignment of crossbreds as purebreds.
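As an illustration of why balanced accuracy is mentioned as an evaluation metric, the following minimal Python sketch compares plain accuracy with balanced accuracy (the mean of per-breed recall) on a validation set with unequal breed sizes. The breed labels, the counts, and the function name `balanced_accuracy` are hypothetical and chosen only for this example; they are not taken from the study.

```python
from collections import defaultdict


def balanced_accuracy(true_breeds, predicted_breeds):
    """Mean of per-breed recall, so each breed weighs the same whatever its size."""
    total = defaultdict(int)    # number of animals per true breed
    correct = defaultdict(int)  # number of correctly assigned animals per true breed
    for truth, pred in zip(true_breeds, predicted_breeds):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[breed] / total[breed] for breed in total]
    return sum(recalls) / len(recalls)


# Hypothetical validation set: 8 animals of breed "A" and 2 of breed "B".
true_breeds = ["A"] * 8 + ["B"] * 2
# Hypothetical model that assigns every animal to the majority breed.
predicted_breeds = ["A"] * 10

plain_accuracy = sum(
    t == p for t, p in zip(true_breeds, predicted_breeds)
) / len(true_breeds)
bal_acc = balanced_accuracy(true_breeds, predicted_breeds)

print(f"plain accuracy:    {plain_accuracy:.2f}")
print(f"balanced accuracy: {bal_acc:.2f}")
```

In this toy case, plain accuracy looks acceptable because the majority breed dominates, whereas balanced accuracy reveals that the minority breed is never assigned correctly.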
Disciplines :
Agriculture & agronomy
Author, co-author :
Wilmot, Hélène ; Université de Liège - ULiège > Département GxABT > Animal Sciences (AS)
Lourenco, Daniela ; University of Georgia > Department of Animal and Dairy Science
Gengler, Nicolas ; Université de Liège - ULiège > Département GxABT > Animal Sciences (AS)
Language :
English
Title :
Optimizing breed or line assignment based on genomic information: best practices for accurate results