Abstract:
We study the generalization properties of pruned models that emerge as winning tickets of the lottery ticket hypothesis on photorealistic datasets. We analyse their potential under conditions in which training data is scarce and comes from a non-photorealistic domain. More specifically, we investigate whether pruned models found on the popular CIFAR-10/100 and Fashion-MNIST datasets generalize to seven different datasets from the fields of digital pathology and digital heritage. Our results show that there are significant benefits in training sparse architectures over larger, fully parametrized models, since in all of our experiments the pruned networks significantly outperform their larger unpruned counterparts. These results suggest that winning initializations do contain inductive biases that are generic to neural networks, although, as our experiments on the biomedical datasets indicate, their generalization properties can be more limited than what has so far been observed in the literature.