Doctoral thesis (Dissertations and theses)
Addressing data scarcity with deep transfer learning and self-training in digital pathology
Mormont, Romain
2022
 

Files


Full Text
thesis_full.pdf
Author preprint (53.4 MB)

All documents in ORBi are protected by a user license.




Details



Keywords :
machine learning; digital pathology; transfer learning; multi-task learning; self-training; computational pathology; data scarcity
Abstract :
[en] Pathology, the field of medicine and biology concerned with studying and diagnosing diseases, is on the brink of a revolution driven by technological advances in artificial intelligence and machine learning. Traditionally, the medium used for research and diagnosis in this field has been the glass slide, on which tissue and cell samples are mounted and later analyzed under an optical microscope. Dedicated scanners are now able to digitize these glass slides into large digital images called whole-slide images, which can then be reviewed on a computer. This new medium also offers unprecedented opportunities for computers to assist practitioners by automating the most time-consuming and tedious analysis tasks. The field concerned with this digitization, automation and related topics is called digital pathology. Machine and deep learning methods are great candidates for tackling these automation tasks thanks to their ability to automatically learn models and capture complex patterns directly from data. However, digital pathology presents several challenges for learning methods. In particular, the field suffers from data scarcity: data, especially annotated data, is difficult to obtain because of privacy concerns, the cost of annotation, and other factors. In this thesis, we explore different machine learning techniques tailored for tackling data scarcity. We first study deep transfer learning, a family of methods which consists of reusing a model that has been learned on a task different from the target task. We investigate best practices regarding how deep convolutional neural network (CNN) models pre-trained on ImageNet, a dataset of photographs, can be transferred to digital pathology image classification tasks. We notably show that, in digital pathology, fine-tuning outperforms feature extraction and draw other practical conclusions regarding transfer from ImageNet.
Motivated by the fact that transfer performs better when the source and target tasks are close, we then use multi-task learning to pre-train a model directly on pathology data. We show that this technique is effective for creating a transferable model tailored for pathology tasks. Finally, we move to the topic of self-training, a family of methods where a model being learned is used to annotate unlabeled data that is then incorporated into the training process. In particular, we apply this technique to image segmentation in order to exploit a dataset which has been only sparsely labeled. We show that our approach makes better use of the sparsely labeled data than a supervised approach does.
Disciplines :
Computer science
Author, co-author :
Mormont, Romain  ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Language :
English
Title :
Addressing data scarcity with deep transfer learning and self-training in digital pathology
Alternative titles :
[fr] Aborder la pénurie de donnée avec des techniques d'apprentissage profond par transfert et auto-apprentissage en pathologie digitale
Defense date :
September 2022
Number of pages :
xvi, 148 + 58
Institution :
ULiège - Université de Liège [Faculté des Sciences Appliquées], Liège, Belgium
Degree :
Doctor of Philosophy in Engineering Science
Promotor :
Geurts, Pierre ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Marée, Raphaël  ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
President :
Louppe, Gilles  ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Jury member :
Wehenkel, Louis  ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Van Droogenbroeck, Marc  ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Decaestecker, Christine;  ULB - Université Libre de Bruxelles [BE] > Ecole polytechnique de Bruxelles > Laboratory of Image Synthesis and Analysis (LISA)
Ciompi, Francesco;  Radboud University Nijmegen [NL] > Radboud University Medical Center (UMC) > Computational Pathology Group
Available on ORBi :
since 20 July 2022

Statistics


Number of views
218 (40 by ULiège)
Number of downloads
76 (28 by ULiège)
