Addressing data scarcity with deep transfer learning and self-training in digital pathology

Mormont, Romain

Doctoral thesis (Dissertations and theses)

2022

Permalink
https://hdl.handle.net/2268/293358

Files

Full Text

thesis_full.pdf

Author preprint (53.4 MB)

All documents in ORBi are protected by a user license.

Details

Keywords :

machine learning; digital pathology; transfer learning; multi-task learning; self-training; computational pathology; data scarcity

Abstract :

[en] Pathology, the field of medicine and biology concerned with studying and diagnosing diseases, is on the brink of a revolution driven by advances in artificial intelligence and machine learning. Traditionally, the medium used for research and diagnosis in this field has been the glass slide, on which tissue and cell samples are applied and later analyzed under an optical microscope. Dedicated scanners can now digitize these glass slides into large digital images, called whole-slide images, which can then be reviewed on a computer. This new medium also offers unprecedented opportunities for computers to assist practitioners by automating the most time-consuming and tedious analysis tasks. The field concerned with this digitization, automation and related topics is called digital pathology. Machine learning and deep learning methods are strong candidates for tackling these automation tasks thanks to their ability to learn models and capture complex patterns directly from data. However, digital pathology presents several challenges for learning methods. In particular, the field suffers from data scarcity: data, especially annotated data, is difficult to obtain because of privacy concerns, the cost of annotation, etc. In this thesis, we explore different machine learning techniques tailored to tackling data scarcity. We first study deep transfer learning, a family of methods that consist of re-using a model learned on a task different from the target task. We investigate best practices for transferring deep convolutional neural network (CNN) models pre-trained on ImageNet, a dataset of photographs, to digital pathology image classification tasks. We notably show that, in digital pathology, fine-tuning outperforms feature extraction, and we draw other practical conclusions regarding transfer from ImageNet.
Motivated by the fact that transfer performs better when the source and target tasks are close, we then use multi-task learning to pre-train a model directly on pathology data. We show that this technique is effective for creating a transferable model tailored to pathology tasks. Finally, we move to the topic of self-training, a family of methods in which a model being learned is used to annotate unlabeled data that is then incorporated into the training process. In particular, we apply this technique to image segmentation in order to exploit a dataset that has been only sparsely labeled. We show that our approach makes better use of the sparsely labeled data than a supervised approach.
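To illustrate the general idea of self-training described in the abstract, here is a minimal sketch on toy 2-D data. It is not the method of the thesis, which applies self-training to image segmentation with deep networks; the nearest-centroid classifier, the confidence threshold of 0.9, and all function names (`fit_centroids`, `predict_proba`, `self_train`) are assumptions made for illustration only.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    """Toy 'model': one mean vector per class (assumes every class has samples)."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict_proba(centroids, X):
    """Softmax over negative squared distances to each class centroid."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def self_train(X_lab, y_lab, X_unlab, n_classes, threshold=0.9, rounds=5):
    """Generic pseudo-labeling loop (hypothetical helper, not the thesis code)."""
    X_train, y_train = X_lab.copy(), y_lab.copy()
    remaining = X_unlab.copy()
    for _ in range(rounds):
        if len(remaining) == 0:
            break
        # 1. Train the model on the current (labeled + pseudo-labeled) set.
        centroids = fit_centroids(X_train, y_train, n_classes)
        # 2. Use the model to annotate the unlabeled data.
        proba = predict_proba(centroids, remaining)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break
        # 3. Incorporate confident pseudo-labels into the training set.
        X_train = np.vstack([X_train, remaining[confident]])
        y_train = np.concatenate([y_train, proba[confident].argmax(axis=1)])
        remaining = remaining[~confident]
    # Final fit on everything collected so far.
    return fit_centroids(X_train, y_train, n_classes), len(X_train)
```

In this sketch, only predictions above the confidence threshold are turned into pseudo-labels, which is one common way to limit the propagation of wrong labels; other selection or weighting schemes are possible.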

Disciplines :

Computer science

Author, co-author :

Mormont, Romain ; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science

Language :

English

Title :

Addressing data scarcity with deep transfer learning and self-training in digital pathology

Alternative titles :

[fr] Aborder la pénurie de donnée avec des techniques d'apprentissage profond par transfert et auto-apprentissage en pathologie digitale

Defense date :

September 2022

Number of pages :

xvi, 148 + 58

Institution :

ULiège - Université de Liège [Faculté des Sciences Appliquées], Liège, Belgium

Degree :

Doctor of Philosophy in Engineering Science

Promotor :

Geurts, Pierre ; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science

Marée, Raphaël ; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science

President :

Louppe, Gilles ; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science

Jury member :

Wehenkel, Louis ; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science

Van Droogenbroeck, Marc ; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science

Decaestecker, Christine ; ULB - Université Libre de Bruxelles [BE] > Ecole polytechnique de Bruxelles > Laboratory of Image Synthesis and Analysis (LISA)

Ciompi, Francesco ; Radboud University Nijmegen [NL] > Radboud University Medical Center (UMC) > Computational Pathology Group

Available on ORBi :

since 20 July 2022

Statistics

Number of views

341 (54 by ULiège)

Number of downloads

271 (35 by ULiège)
