Doctoral thesis (Dissertations and theses)
Characterization of neurodegenerative diseases with tree ensemble methods: the case of Alzheimer's disease
Wehenkel, Marie
2018
 

Files


Full Text
Thesis.pdf
Publisher postprint (15.48 MB)

Details



Keywords :
Machine learning; Alzheimer's disease; CAD systems; Tree ensemble methods; Random Forests; Group selection
Abstract :
Over the last decade, the field of neuroscience has seen the emergence of machine learning methods for the analysis of neuroimaging data. Unlike univariate methods, which consider voxels one by one, these techniques analyse relationships between several voxels and are able to detect multivariate patterns. In the context of neurodegenerative diseases such as Alzheimer’s disease (AD), they can be used to design a diagnosis system and to find, in neuroimages, the patterns responsible for the disease. The context of the work presented here is thus the field of pattern recognition with neuroimaging. Our objective is to explore the possibilities that tree ensemble methods, such as Random Forests, offer in this domain in general, and in AD research in particular. These methods suit the needs of this domain very well, as they combine very good predictive performance with interpretable results in the form of variable importance scores. Our contributions include both methodological developments around tree ensemble methods and applications of these methods to real datasets. The methodological part of the thesis focuses on the analysis and improvement of Random Forests variable importances for neuroimaging problems. Typical datasets in this domain are of very high dimensionality (hundreds of thousands of voxels) and contain comparatively very few samples (tens or hundreds of patients). Our first contribution is a theoretical and empirical analysis of how importance scores behave in such extreme settings, depending on the method parameters. We then propose several improvements of importance scores in such settings that take advantage of either the spatial structure between the features or a pre-defined partitioning of these features into groups. Finally, we address an open issue with Random Forests importances: finding a threshold that separates truly relevant variables from irrelevant ones. For this purpose, we adapt several statistical methods proposed in the bioinformatics literature and extend them to compute a statistical score for groups of features instead of individual features. This adaptation to the group level was motivated by our expectation that groups of voxels, rather than isolated voxels, explain a disease. We show that working at the group level leads to a higher statistical power than working at the feature level. The approach is applied to a real dataset for the prognosis of AD, where it is shown to highlight brain regions that are consistent with results in the literature.
In the second part of the thesis, we present different applications of Random Forests to AD research. First, we use tree-based ensemble methods to clinically characterize two different metabolic profiles observed in PET scans of AD patients. Second, we carry out an empirical comparison showing that Random Forests are competitive with linear methods, in terms of accuracy and interpretability, on different real datasets related to three research questions about AD: the diagnosis of demented patients, the prognosis of patients with mild cognitive impairment (MCI), and the differentiation of MCI and AD patients.
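The group-level scoring idea summarized in the abstract can be illustrated with a small sketch. It is not the thesis's implementation: the abstract does not name the statistical methods that were adapted, so a plain label-permutation test is assumed here as one common choice. The ensemble is also simplified to bootstrapped one-split trees (stumps) that threshold at the feature mean, and the data are synthetic. All function names (`stump_importances`, `group_scores`, `group_pvalues`) are hypothetical.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def stump_importances(X, y, rng, n_trees=100, k=3):
    """Mean impurity decrease per feature over an ensemble of randomized stumps.

    Simplification (assumption): each tree is a single split on a bootstrap
    sample, chosen among k random features, thresholded at the feature mean.
    """
    n, p = len(X), len(X[0])
    imp = [0.0] * p
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        yb = [y[i] for i in idx]
        parent = gini(yb)
        best_gain, best_j = 0.0, None
        for j in rng.sample(range(p), k):  # random feature subset
            xs = [X[i][j] for i in idx]
            t = sum(xs) / n
            left = [c for v, c in zip(xs, yb) if v <= t]
            right = [c for v, c in zip(xs, yb) if v > t]
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if gain > best_gain:
                best_gain, best_j = gain, j
        if best_j is not None:
            imp[best_j] += best_gain / n_trees
    return imp

def group_scores(imp, groups):
    """Group importance taken as the sum of its features' importances."""
    return [sum(imp[j] for j in g) for g in groups]

def group_pvalues(X, y, groups, rng, n_perm=50):
    """Permutation p-value per group: how often shuffled labels score as high."""
    observed = group_scores(stump_importances(X, y, rng), groups)
    exceed = [0] * len(groups)
    y_perm = list(y)
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any link between features and labels
        null = group_scores(stump_importances(X, y_perm, rng), groups)
        for g in range(len(groups)):
            if null[g] >= observed[g]:
                exceed[g] += 1
    return [(1 + e) / (n_perm + 1) for e in exceed]

# Synthetic data: 60 samples, 12 features in 4 groups of 3.
# Only the first group (features 0-2) carries signal.
rng = random.Random(0)
n, p = 60, 12
y = [i % 2 for i in range(n)]
X = [[rng.gauss(0, 1) + (2.0 if (j < 3 and y[i]) else 0.0) for j in range(p)]
     for i in range(n)]
groups = [list(range(s, s + 3)) for s in range(0, p, 3)]

pvals = group_pvalues(X, y, groups, rng)
print(pvals)
```

In this toy setting the informative group should receive the smallest p-value. A group whose p-value falls below a chosen level would be declared relevant, which is one way to place a threshold between relevant and irrelevant groups.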
Research Center/Unit :
GIGA CRC (Cyclotron Research Center) In vivo Imaging-Aging & Memory - ULiège
Disciplines :
Engineering, computing & technology: Multidisciplinary, general & others
Author, co-author :
Wehenkel, Marie ;  Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Characterization of neurodegenerative diseases with tree ensemble methods: the case of Alzheimer's disease
Defense date :
17 September 2018
Number of pages :
146 + 25
Institution :
ULiège - Université de Liège
Degree :
Docteur en Sciences de l'ingénieur
Promotor :
Phillips, Christophe  ;  Université de Liège - ULiège > GIGA > GIGA CRC In vivo Imaging - Neuroimaging, data acquisition and processing
Geurts, Pierre  ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
President :
Ernst, Damien  ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Jury member :
Bastin, Christine  ;  Université de Liège - ULiège > GIGA > GIGA CRC In vivo Imaging - Aging & Memory
Louppe, Gilles  ;  Université de Liège - ULiège > Département d'électricité, électronique et informatique (Institut Montefiore) > Big Data
Saeys, Yvan
Bzdok, Danilo
Funders :
F.R.S.-FNRS - Fonds de la Recherche Scientifique
Available on ORBi :
since 14 September 2018
