camera calibration; localization; soccer player; SoccerNet; action spotting; deep learning; computer vision
Abstract :
[en] Soccer broadcast video understanding has drawn considerable attention in recent years among data scientists and industrial companies. This is mainly due to the lucrative potential unlocked by effective deep learning techniques developed in the field of computer vision. In this work, we focus on camera calibration and on its current limitations for the scientific community. More precisely, we tackle the absence of a large-scale calibration dataset and of a public calibration network trained on such a dataset. Specifically, we distill a powerful commercial calibration tool into a recent neural network architecture on the large-scale SoccerNet dataset, composed of untrimmed broadcast videos of 500 soccer games. We further release our distilled network, and leverage it to provide three ways of representing the calibration results along with player localization. Finally, we exploit those representations within the current best architecture for the action spotting task of SoccerNet-v2, and achieve new state-of-the-art performance.
Research center :
Montefiore Institute - Montefiore Institute of Electrical Engineering and Computer Science - ULiège Telim
Disciplines :
Computer science
Author, co-author :
Cioppa, Anthony ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Télécommunications
Deliège, Adrien ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Télécommunications
Magera, Floriane ; Université de Liège - ULiège > Montefiore Institute
Giancola, Silvio
Barnich, Olivier
Ghanem, Bernard
Van Droogenbroeck, Marc ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Télécommunications
Language :
English
Title :
Camera Calibration and Player Localization in SoccerNet-v2 and Investigation of their Representations for Action Spotting
Publication date :
June 2021
Event name :
IEEE International Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), CVsports
Event organizer :
IEEE
Event place :
Nashville, TN, United States
Event date :
from 19 June 2021 to 25 June 2021
Audience :
International
Main work title :
IEEE International Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Peer reviewed :
Peer reviewed
Name of the research project :
DeepSport
Funders :
FRIA - Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture [BE] ; DGTRE - Région wallonne. Direction générale des Technologies, de la Recherche et de l'Énergie [BE]
Commentary :
Paper accepted for the CVsports Workshop at CVPR2021.