Machine learning; Random forests; Feature selection; Group based method; Prognosis system; FDG PET; Alzheimer's disease
Abstract :
[en] Machine learning approaches have been increasingly used in the neuroimaging field for the design of computer-aided diagnosis systems. In this paper, we focus on the ability of these methods to provide interpretable information about the brain regions that are the most informative about the disease or condition of interest. In particular, we investigate the benefit of group-based, instead of voxel-based, analyses in the context of random forests. Assuming a prior division of the voxels into non-overlapping groups (defined by an atlas), we propose several procedures to derive group importances from the individual voxel importances produced by random forest models. We then adapt several permutation schemes to turn group importance scores into more interpretable statistical scores that make it possible to identify the truly relevant groups in the importance rankings. The good behavior of these methods is first assessed on artificial datasets. They are then applied to our own dataset of FDG-PET scans to identify the brain regions involved in the prognosis of Alzheimer's disease.
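The two steps described in the abstract (aggregating voxel importances per atlas group, then calibrating group scores by permutation) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of aggregation (sum by default), and the single label-permutation scheme are assumptions made for illustration, and `mean_difference_importance` is a hypothetical stand-in for the voxel importances that a fitted random forest would supply.

```python
import random
from collections import defaultdict
from statistics import mean


def group_importances(voxel_imp, atlas, agg=sum):
    """Aggregate per-voxel importances into one score per atlas group.

    `atlas[i]` is the group label of voxel i (groups do not overlap).
    `agg` is the aggregation rule, e.g. sum, max, or statistics.mean.
    """
    by_group = defaultdict(list)
    for score, label in zip(voxel_imp, atlas):
        by_group[label].append(score)
    return {label: agg(scores) for label, scores in by_group.items()}


def permutation_pvalues(X, y, atlas, importance_fn, n_perm=99, agg=sum, seed=0):
    """Empirical p-value per group under random permutations of the labels.

    `importance_fn(X, y)` must return one importance per voxel (column of X).
    This is one simple permutation scheme, assumed here for illustration.
    """
    rng = random.Random(seed)
    observed = group_importances(importance_fn(X, y), atlas, agg)
    exceed = dict.fromkeys(observed, 0)
    y_perm = list(y)
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break the link between scans and labels
        null = group_importances(importance_fn(X, y_perm), atlas, agg)
        for g in observed:
            if null[g] >= observed[g]:
                exceed[g] += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return {g: (exceed[g] + 1) / (n_perm + 1) for g in observed}


def mean_difference_importance(X, y):
    """Hypothetical stand-in importance: |difference of class means| per voxel."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    return [abs(mean(p) - mean(n)) for p, n in zip(zip(*pos), zip(*neg))]


if __name__ == "__main__":
    # Toy data: 40 "scans", 6 "voxels"; only group A carries class signal.
    rng = random.Random(1)
    atlas = ["A", "A", "A", "B", "B", "B"]
    y = [0] * 20 + [1] * 20
    X = [[rng.gauss(2.0 * label if g == "A" else 0.0, 1.0) for g in atlas]
         for label in y]
    print(permutation_pvalues(X, y, atlas, mean_difference_importance))
```

In a realistic setting, `importance_fn` would fit a forest on `(X, y)` and return its per-voxel importance vector; the stand-in above only keeps the sketch dependency-free. Small p-values point to groups whose importance is unlikely to arise when the labels carry no information.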
Research Center/Unit :
GIGA CRC (Cyclotron Research Center) In vivo Imaging-Aging & Memory - ULiège
Disciplines :
Engineering, computing & technology: Multidisciplinary, general & others
Author, co-author :
Wehenkel, Marie ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Sutera, Antonio ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorith. des syst. en interaction avec le monde physique
Bastin, Christine ; Université de Liège - ULiège > Département des sciences cliniques > Neuroimagerie des troubles de la mémoire et révalid. cogn.
Geurts, Pierre ✱; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorith. des syst. en interaction avec le monde physique