Paper published in a book (Scientific congresses and symposiums)
Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders
Eymaël, Alexandre; Vandeghen, Renaud; Cioppa, Anthony et al.
2024, in European Conference on Computer Vision
Peer reviewed
 

Files


Full Text
Eymael2024Efficient-arxiv.pdf
Publisher postprint (25.16 MB)

Details



Keywords :
Self-supervised learning; Masked autoencoders; Siamese networks; Video segmentation; Label propagation
Abstract :
Self-supervised pre-training of image encoders is omnipresent in the literature, particularly following the introduction of masked autoencoders (MAE). Current efforts attempt to learn object-centric representations from motion in videos. In particular, SiamMAE recently introduced a Siamese network, training a shared-weight encoder from two frames of a video with a high asymmetric masking ratio (95%). In this work, we propose CropMAE, an alternative approach to the Siamese pre-training introduced by SiamMAE. Our method differs by exclusively considering pairs of crops taken from the same image but cropped differently, rather than the conventional pairs of frames extracted from a video. CropMAE therefore alleviates the need for video datasets, while maintaining competitive performance and drastically reducing pre-training time. Furthermore, we demonstrate that CropMAE learns similar object-centric representations without explicit motion, showing that current self-supervised learning methods do not learn objects from motion, but rather thanks to the Siamese architecture. Finally, CropMAE achieves the highest masking ratio to date (98.5%), enabling the reconstruction of images using only two visible patches. Our code is available at https://github.com/alexandre-eymael/CropMAE.
Research Center/Unit :
TELIM
Montefiore Institute - Montefiore Institute of Electrical Engineering and Computer Science - ULiège
VIULab
Disciplines :
Computer science
Author, co-author :
Eymaël, Alexandre; University of Liège, Belgium
Vandeghen, Renaud; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Cioppa, Anthony; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science; KAUST, Saudi Arabia
Giancola, Silvio; KAUST, Saudi Arabia
Ghanem, Bernard; KAUST, Saudi Arabia
Van Droogenbroeck, Marc; Université de Liège - ULiège > Département d'électricité, électronique et informatique (Institut Montefiore) > Télécommunications
 These authors have contributed equally to this work.
Language :
English
Title :
Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders
Publication date :
31 October 2024
Event name :
European Conference on Computer Vision (ECCV)
Event organizer :
ECVA
Event place :
Milan, Italy
Event date :
September 29 to October 4, 2024
Event number :
18
Audience :
International
Main work title :
European Conference on Computer Vision
Publisher :
Springer
Collection name :
Lecture Notes in Computer Science, volume 15081
Pages :
348–366
Peer review/Selection committee :
Peer reviewed
Tags :
CÉCI : Consortium des Équipements de Calcul Intensif
Tier-1 supercalculateur
European Projects :
H2020 - 951732 - EUROCC - National Competence Centres in the framework of EuroHPC
Name of the research project :
Lucia
Funders :
F.R.S.-FNRS - Fonds de la Recherche Scientifique
SPW - Public Service of Wallonia
European Union
Funding number :
1910247
Funding text :
A. Cioppa is funded by the F.R.S.-FNRS. The research reported in this publication was supported by funding from KAUST Center of Excellence on GenAI, under award number 5940, and the SDAIA-KAUST Center of Excellence in Data Science and Artificial Intelligence. The present research benefited from computational resources made available on Lucia, the Tier-1 supercomputer of the Walloon Region, infrastructure funded by the Walloon Region under the grant agreement no 1910247. We acknowledge EuroCC Belgium for awarding this project access to the LUMI supercomputer, owned by the EuroHPC Joint Undertaking, hosted by CSC (Finland) and the LUMI consortium.
Available on ORBi :
since 27 March 2024
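
As an illustration of the pair construction described in the abstract (two different crops of one image, with a very high masking ratio on one of them), here is a minimal sketch. It is not the authors' implementation, for which see the repository linked in the abstract. The function names, the crop scale range (0.3 to 1.0), the 14×14 patch grid (a 224-pixel image with 16-pixel patches), the truncation used to count visible patches, and the reading of "asymmetric" as "only the second view is masked" are all assumptions made for this example.

```python
import random


def random_crop_box(width, height, scale_range, rng):
    """Return a random crop (left, top, w, h) covering a sampled fraction of the image area."""
    scale = rng.uniform(*scale_range)          # fraction of the area to keep
    crop_w = max(1, int(width * scale ** 0.5))
    crop_h = max(1, int(height * scale ** 0.5))
    left = rng.randint(0, width - crop_w)      # randint is inclusive on both ends
    top = rng.randint(0, height - crop_h)
    return left, top, crop_w, crop_h


def visible_patch_indices(num_patches, mask_ratio, rng):
    """Sample the indices of the patches that stay visible after masking."""
    keep = int(num_patches * (1 - mask_ratio))  # assumption: truncate toward zero
    return sorted(rng.sample(range(num_patches), keep))


def make_pair(width, height, grid=14, mask_ratio=0.985, seed=0):
    """Build one hypothetical training pair from a single image."""
    rng = random.Random(seed)
    # Two different crops of the SAME image stand in for two video frames.
    crop_a = random_crop_box(width, height, (0.3, 1.0), rng)
    crop_b = random_crop_box(width, height, (0.3, 1.0), rng)
    # Assumed asymmetric masking: view A stays fully visible, view B is masked.
    visible_b = visible_patch_indices(grid * grid, mask_ratio, rng)
    return crop_a, crop_b, visible_b


if __name__ == "__main__":
    crop_a, crop_b, visible_b = make_pair(224, 224)
    print("crop A:", crop_a)
    print("crop B:", crop_b)
    print("visible patches in B:", visible_b)
```

Under these assumptions, a 14×14 grid has 196 patches, and a 98.5% masking ratio leaves int(196 × 0.015) = 2 of them visible, which is consistent with the "only two visible patches" mentioned in the abstract.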

Statistics


Number of views
291 (38 by ULiège)
Number of downloads
70 (11 by ULiège)

Scopus citations®
3
Scopus citations® without self-citations
3
OpenCitations
0
OpenAlex citations
5
