Keywords :
semantic segmentation; human segmentation; real-time; online distillation; deep learning; artificial intelligence; computer vision; soccer; basketball; sports; players; ARTHuS; sport; football
Abstract :
[en] Semantic segmentation can be regarded as a useful tool for global scene understanding in many areas, including sports, but it has inherent difficulties, such as the need for pixel-wise annotated training data and the absence of well-performing real-time universal algorithms. To alleviate these issues, we sacrifice universality by developing a general method, named ARTHuS, that produces adaptive real-time match-specific networks for human segmentation in sports videos, without requiring any manual annotation. This is done by an online knowledge distillation process, in which a fast student network is trained to mimic the output of an existing slow but effective universal teacher network, while being periodically updated to adjust to the latest play conditions. As a result, ARTHuS makes it possible to build highly effective real-time human segmentation networks that evolve through the match and that sometimes outperform their teacher. The usefulness of producing adaptive match-specific networks and their excellent performance are demonstrated quantitatively and qualitatively for soccer and basketball matches.
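The online distillation loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the "teacher" is a hypothetical rule with access to scene context, the "student" is a single learned intensity threshold rather than a neural network, frames are short lists of synthetic pixel intensities, and every name and parameter (`teacher_segment`, `ThresholdStudent`, `teacher_every`, `buffer_size`, the brightness drift) is an assumption made for illustration. It only shows the general idea: the teacher labels an occasional frame, the labels fill a small buffer of recent frames, and the student is refitted on that buffer so that it follows changing conditions.

```python
import random
from collections import deque


def make_frame(rng, brightness, n_background=30, n_human=10):
    """Synthetic frame: background pixels near `brightness`, human pixels darker."""
    background = [brightness + rng.uniform(-0.05, 0.05) for _ in range(n_background)]
    humans = [brightness - 0.5 + rng.uniform(-0.05, 0.05) for _ in range(n_human)]
    return background + humans


def teacher_segment(frame, brightness):
    """Hypothetical slow teacher: uses scene context (the true brightness)."""
    return [1 if px < brightness - 0.25 else 0 for px in frame]


class ThresholdStudent:
    """Toy fast student: one intensity threshold fitted to teacher masks."""

    def __init__(self):
        self.threshold = 0.0

    def fit(self, samples):
        # Midpoint between the mean "human" and mean "background" intensity,
        # computed from the teacher's labels only (no manual annotation).
        human = [px for frame, mask in samples for px, m in zip(frame, mask) if m == 1]
        background = [px for frame, mask in samples for px, m in zip(frame, mask) if m == 0]
        if human and background:
            self.threshold = (sum(human) / len(human) + sum(background) / len(background)) / 2

    def segment(self, frame):
        return [1 if px < self.threshold else 0 for px in frame]


def run(num_frames=60, teacher_every=5, buffer_size=3, adaptive=True, seed=0):
    """Return the student's pixel agreement with the teacher over a drifting video."""
    rng = random.Random(seed)
    student = ThresholdStudent()
    buffer = deque(maxlen=buffer_size)  # keeps only the most recent labelled frames
    correct = total = 0
    for t in range(num_frames):
        # Conditions drift during the "match": the scene gets darker.
        brightness = 0.9 - 0.4 * t / (num_frames - 1)
        frame = make_frame(rng, brightness)
        # Teacher output; here it is computed on every frame only for scoring.
        reference = teacher_segment(frame, brightness)
        if t == 0:
            student.fit([(frame, reference)])  # warm-up on the first labelled frame
        prediction = student.segment(frame)  # fast path, runs on every frame
        correct += sum(p == r for p, r in zip(prediction, reference))
        total += len(frame)
        if adaptive and t % teacher_every == 0:
            buffer.append((frame, reference))  # teacher labels an occasional frame
            student.fit(buffer)  # periodic student update on recent data
    return correct / total


if __name__ == "__main__":
    print("adaptive student:", run(adaptive=True))
    print("static student:  ", run(adaptive=False))
```

In this toy setting, the student that is refitted periodically keeps agreeing with the teacher as the brightness drifts, whereas a student fitted only on the first frame starts mislabelling background pixels once the scene has darkened enough.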
Research Center/Unit :
Montefiore Institute - Montefiore Institute of Electrical Engineering and Computer Science - ULiège Telim
Disciplines :
Electrical & electronics engineering
Author, co-author :
Cioppa, Anthony ✱; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Télécommunications
Deliège, Adrien ✱; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Télécommunications
Istasse, Maxime; UCLouvain
De Vleeschouwer, Christophe; UCLouvain
Van Droogenbroeck, Marc; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Télécommunications
✱ These authors have contributed equally to this work.
Language :
English
Title :
ARTHuS: Adaptive Real-Time Human Segmentation in Sports through Online Distillation
Publication date :
June 2019
Event name :
IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) - CVSports
Event organizer :
IEEE
Event place :
Long Beach, United States - California
Event date :
from 16-06-2019 to 20-06-2019
Audience :
International
Journal title :
Conference on Computer Vision and Pattern Recognition Workshops
Pages :
2505-2514
Peer reviewed :
Peer reviewed
Name of the research project :
DeepSport
Funders :
DGTRE - Région wallonne. Direction générale des Technologies, de la Recherche et de l'Énergie