Article (Scientific journals)
Ghost Loss to Question the Reliability of Training Data
Deliège, Adrien; Cioppa, Anthony; Van Droogenbroeck, Marc
2020, in IEEE Access, 8, p. 44774-44782
Peer Reviewed verified by ORBi
 

Files


Full Text
Deliege2020Ghost.pdf
Author preprint (656.87 kB)
Ghost Loss to Question the Reliability of Training Data: full paper



Details



Keywords :
classification; computer vision; deep learning; mislabeled data; noisy labels; ghost; training data; sanity matrix
Abstract :
Supervised image classification problems rely on training data assumed to have been correctly annotated; this assumption underpins most works in the field of deep learning. Consequently, during training, a network is forced to match the label provided by the annotator and is given no flexibility to choose an alternative in the face of inconsistencies that it might be able to detect. Erroneously labeled training images may therefore end up “correctly” classified in classes to which they do not actually belong. This may reduce the performance of the network and thus encourage building more complex networks without even checking the quality of the training data. In this work, we question the reliability of annotated datasets. For that purpose, we introduce the notion of ghost loss, which can be seen as a regular loss that is zeroed out for some predicted values in a deterministic way and that allows the network to choose an alternative to the given label without being penalized. After a proof-of-concept experiment, we use the ghost loss principle to detect confusing images and erroneously labeled images in well-known training datasets (MNIST, Fashion-MNIST, SVHN, CIFAR10), and we provide a new tool, called the sanity matrix, for summarizing these confusions.
Research center :
Telim
Montefiore Institute - Montefiore Institute of Electrical Engineering and Computer Science - ULiège
Disciplines :
Computer science
Author, co-author :
Deliège, Adrien ;  Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Télécommunications
Cioppa, Anthony ;  Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Télécommunications
Van Droogenbroeck, Marc  ;  Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Télécommunications
Language :
English
Title :
Ghost Loss to Question the Reliability of Training Data
Publication date :
04 March 2020
Journal title :
IEEE Access
ISSN :
2169-3536
Publisher :
Institute of Electrical and Electronics Engineers, United States - New Jersey
Volume :
8
Pages :
44774-44782
Peer reviewed :
Peer Reviewed verified by ORBi
Available on ORBi :
since 06 March 2020
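
Illustrative sketch :
The abstract describes the ghost loss as a regular loss that is zeroed out for some predicted values in a deterministic way. The Python sketch below shows one possible reading of that sentence and is not taken from the paper. It rests on three assumptions that the abstract does not confirm: the regular loss is a per-class squared error against a one-hot target, the "ghost" is the highest-scoring class other than the given label, and ties are broken by the lowest class index. The function name `ghost_loss` is hypothetical.

```python
from typing import List, Tuple


def ghost_loss(scores: List[float], label: int) -> Tuple[float, int]:
    """Return (loss, ghost_index) for one sample.

    Assumption (not from the paper): the loss is a sum of per-class squared
    errors against a one-hot target, and the term of one "ghost" class is
    dropped so that proposing it as an alternative costs nothing.
    """
    if len(scores) < 2 or not 0 <= label < len(scores):
        raise ValueError("need at least two classes and a valid label index")

    # Deterministic ghost choice (assumed rule): the best-scoring class that
    # is not the given label. max() keeps the first maximum, so ties go to
    # the lowest class index.
    ghost = max(
        (k for k in range(len(scores)) if k != label),
        key=lambda k: scores[k],
    )

    total = 0.0
    for k, score in enumerate(scores):
        if k == ghost:
            continue  # zeroed-out term: the ghost prediction is not penalized
        target = 1.0 if k == label else 0.0
        total += (score - target) ** 2
    return total, ghost


if __name__ == "__main__":
    # The given label is class 2, but class 1 also scores highly.
    loss, ghost = ghost_loss([0.1, 0.8, 0.9, 0.0], label=2)
    print(f"ghost class: {ghost}, loss: {loss:.4f}")
```

Under these assumptions, a sample whose ghost class scores close to (or above) its given label could be flagged as confusing or possibly mislabeled, which is the kind of case the abstract says the method is used to detect.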

