References of "Abo Alchamlat, Sinan"
CONTRIBUTION TO EPISTASY MAPPING METHODS THROUGH THE USE OF NON-PARAMETRIC METHODOLOGY
Abo Alchamlat, Sinan ULiege

Doctoral thesis (2018)

Introduction

Recent years have seen the emergence of a wealth of genetic information at the molecular level. Some of the main recent breakthroughs in biology originate from this new knowledge, which has allowed new strategies to be applied in many fields of biological research. Although approaches targeting the association between phenotypic characteristics and DNA variations have been successful, many elements in the genetic landscape of the studied traits remain unknown and uncharacterized. One route to new findings, potentially useful for a better understanding of complex determinism, is the detection of interactions between genomic regions affecting the traits of interest, rather than single-locus associations. Many methods have focused on detecting such interactions, and some have succeeded in solving difficult problems and in detecting some of these genetic interactions. Nevertheless, there is currently no gold-standard method able to detect interactions in all situations, and the relative performance of these methods remains largely unclear. This thesis is a contribution to the field of interaction mapping. In the first study, we propose a novel approach combining K-Nearest Neighbors (KNN) and Multifactor Dimensionality Reduction (MDR) methods for the detection of gene-gene interactions as a possible alternative to existing algorithms, especially in situations where the number of involved determinants is high. In the second study, we propose another strategy based on the principle of the aggregation of experts, where the experts are a set of popular published methods.

Results

The results obtained in the first study, on both simulated data and real genome-wide data, demonstrate some of the features that make KNN-MDR interesting in terms of accuracy and power: in many cases, it significantly outperforms its recent competitors.
More specifically, the analyses on a large real dataset demonstrate the feasibility of scans using a large number of markers, as opposed to MDR, where the computational burden explodes with the number of markers (whereas it increases only linearly with KNN-MDR). This might, for example, make it possible to highlight interactions between markers far apart on the genomic map (trans-interactions), whereas some strategies restrict the scans to close-by markers (cis-interactions) or to markers with significant marginal effects in order to reduce the amount of computation. For the second study, we also show that aggregating the results of several methods is a strategy with interesting features for detecting epistatic interactions. Experimental results, again based on simulated and real genome-wide data, show that the aggregated predictor can perform better, in terms of statistical power and false positive rate, than each individual predictor at detecting genetic interactions. It is consequently a useful addition to the various methods available to tackle this complicated problem.

Conclusion and Perspectives

In this dissertation, we focused on investigating and developing non-parametric statistical methods aimed at the detection of genetic interactions. We have shown that our novel methods complement, and sometimes improve on, existing approaches used to detect genetic interactions in simulated and real datasets. The presented methodologies (KNN-MDR and aggregation of experts) are valuable in the context of loci and interaction mapping and can enhance the understanding of the biological mechanisms underlying traits of interest, including diseases. More precisely, the new knowledge gained using these methodologies can assist in the prediction of clinical diseases and can contribute to new therapeutic opportunities. As a further step toward these appealing perspectives, a first objective could be to implement a better version of the KNN-MDR software.
The improvements could concern the overall performance of the software (optimization of the time-consuming parts of the program, parallelization), but also the "user-friendliness" of the program. The latter would involve an easier (and perhaps automated) tuning of the parameters to reach optimal detection power. These parameters include, among others, the optimal window sizes (which depend on the studied population, the marker density and the LD pattern), the optimal size of the neighborhoods to be considered, the pre-selection of markers in the early phase of large dataset analyses, the distance measure used, and the adaptive scheme for selecting markers in large studies. The use of other types of genomic variants (microsatellites, copy number variations, sequencing data) could also be considered. Another potential track would be to use a priori information on the interactions, either by using the results of previous studies or by exploiting known information on gene networks.
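
The scaling argument made in the Results (an exhaustive scan whose cost explodes with the number of markers, versus a cost that grows linearly) can be illustrated with simple counting. The sketch below is not taken from the thesis: the function names are hypothetical, and the fixed, non-overlapping windows are only an assumed stand-in for a linearly growing quantity, since the abstract does not describe the actual windowing scheme of KNN-MDR.

```python
from math import comb


def exhaustive_scan_size(n_markers: int, order: int = 2) -> int:
    """Number of marker combinations an exhaustive scan of the given
    interaction order has to test (pairs for order=2, triplets for order=3)."""
    return comb(n_markers, order)


def window_count(n_markers: int, window_size: int) -> int:
    """Number of fixed-size, non-overlapping windows needed to cover all
    markers (illustrative assumption, not the thesis's scheme)."""
    return -(-n_markers // window_size)  # ceiling division


if __name__ == "__main__":
    # Compare growth rates for a few dataset sizes.
    for m in (1_000, 10_000, 100_000):
        print(
            f"markers={m:>7}  pairs={exhaustive_scan_size(m, 2):>14}  "
            f"triplets={exhaustive_scan_size(m, 3):>18}  "
            f"windows(100)={window_count(m, 100):>5}"
        )
```

Multiplying the number of markers by ten multiplies the number of pairs by roughly one hundred and the number of triplets by roughly one thousand, while the window count only grows tenfold.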
