ARTHuS: Adaptive Real-Time Human Segmentation in Sports through Online Distillation

Keywords :

semantic segmentation; human segmentation; real-time; online distillation; deep learning; artificial intelligence; computer vision; soccer; basketball; sports; players; ARTHuS; sport; football

Abstract :

Semantic segmentation is a useful tool for global scene understanding in many areas, including sports, but it has inherent difficulties, such as the need for pixel-wise annotated training data and the absence of well-performing real-time universal algorithms. To alleviate these issues, we sacrifice universality by developing a general method, named ARTHuS, that produces adaptive real-time match-specific networks for human segmentation in sports videos, without requiring any manual annotation. This is done through an online knowledge distillation process, in which a fast student network is trained to mimic the output of an existing slow but effective universal teacher network, while being periodically updated to adjust to the latest play conditions. As a result, ARTHuS makes it possible to build highly effective real-time human segmentation networks that evolve through the match and that sometimes outperform their teacher. The usefulness of producing adaptive match-specific networks and their excellent performance are demonstrated quantitatively and qualitatively for soccer and basketball matches.
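The online distillation loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the teacher below is a hypothetical stand-in rule, the student is a one-feature logistic model instead of a segmentation network, and every name and parameter (`teacher_segment`, `Student`, `label_every`, `buffer_size`, the brightness values) is an assumption made for illustration. It only shows the general pattern: the fast student segments every frame, the teacher labels a subset of frames, and the student is periodically retrained on a small buffer of recent teacher outputs so that it follows a change in play conditions.

```python
import math
import random
from collections import deque


def make_frame(rng, brightness, n_pixels=100):
    """Toy frame: a flat list of intensities; about 30% are brighter 'human' pixels."""
    frame = []
    for _ in range(n_pixels):
        offset = 0.4 if rng.random() < 0.3 else 0.0
        frame.append(brightness + offset + rng.uniform(-0.05, 0.05))
    return frame


def teacher_segment(frame):
    """Hypothetical stand-in for the slow universal teacher.

    It adapts to each frame by thresholding halfway between the darkest
    and brightest pixel, so it works under any global brightness.
    """
    threshold = (min(frame) + max(frame)) / 2.0
    return [1 if x > threshold else 0 for x in frame]


class Student:
    """Fast student: a per-pixel logistic model with a fixed decision rule."""

    def __init__(self):
        self.w = 0.0
        self.b = 0.0

    def predict_proba(self, x):
        z = max(-50.0, min(50.0, self.w * x + self.b))  # clamp to avoid overflow
        return 1.0 / (1.0 + math.exp(-z))

    def segment(self, frame):
        return [1 if self.predict_proba(x) > 0.5 else 0 for x in frame]

    def train(self, dataset, epochs=5, lr=0.5):
        # Plain SGD on the cross-entropy between student and teacher masks.
        for _ in range(epochs):
            for frame, mask in dataset:
                for x, y in zip(frame, mask):
                    err = self.predict_proba(x) - y
                    self.w -= lr * err * x
                    self.b -= lr * err


def run_online_distillation(n_frames=300, label_every=5, buffer_size=5, seed=0):
    """Return the per-frame agreement between student and teacher masks."""
    rng = random.Random(seed)
    student = Student()
    dataset = deque(maxlen=buffer_size)  # only the latest teacher-labelled frames
    agreement = []
    for t in range(n_frames):
        # Simulated change of play conditions: lighting shifts at mid-match.
        brightness = 0.2 if t < n_frames // 2 else 0.5
        frame = make_frame(rng, brightness)

        # Fast path: the student segments every frame.
        predicted = student.segment(frame)

        # Teacher output; computed for every frame here only to measure agreement.
        target = teacher_segment(frame)
        matches = sum(p == y for p, y in zip(predicted, target))
        agreement.append(matches / len(frame))

        # Slow path: every few frames, store a teacher label and update the student.
        if t % label_every == 0:
            dataset.append((frame, target))
            student.train(dataset)
    return agreement


if __name__ == "__main__":
    scores = run_online_distillation()
    print("agreement before lighting change:", sum(scores[130:150]) / 20)
    print("agreement at end of match:", sum(scores[280:300]) / 20)
```

In this sketch the bounded buffer is what makes the student match-specific: old frames are discarded, so after the simulated lighting change the student is refit to the new conditions only.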

Research Center/Unit :

Montefiore Institute - Montefiore Institute of Electrical Engineering and Computer Science - ULiège
Telim

Disciplines :

Electrical & electronics engineering

Author, co-author :

Cioppa, Anthony ^✱; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Télécommunications

Deliège, Adrien ^✱; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Télécommunications

Istasse, Maxime; UCLouvain

De Vleeschouwer, Christophe; UCLouvain

Van Droogenbroeck, Marc; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Télécommunications

^✱ These authors have contributed equally to this work.

Language :

English

Title :

ARTHuS: Adaptive Real-Time Human Segmentation in Sports through Online Distillation

Publication date :

June 2019

Event name :

IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) - CVSports

Event organizer :

IEEE

Event place :

Long Beach, United States - California

Event date :

from 16-06-2019 to 20-06-2019

Audience :

International

Journal title :

Conference on Computer Vision and Pattern Recognition Workshops

Pages :

2505-2514

Peer reviewed :

Peer reviewed

Name of the research project :

DeepSport

Funders :

DGTRE - Région wallonne. Direction générale des Technologies, de la Recherche et de l'Énergie

Commentary :

Best CVSports paper award 2019

Available on ORBi :

since 12 April 2019

Statistics

Number of views

1262 (156 by ULiège)

Number of downloads

735 (87 by ULiège)
