Advances in Digital Music Iconography: Benchmarking the detection of musical instruments in unrestricted, non-photorealistic images from the artistic domain
transfer learning; image classification; image localization
Abstract :
[en] In this paper, we present MINERVA, the first benchmark dataset for the detection of musical
instruments in non-photorealistic, unrestricted image collections from the realm of the visual
arts. This effort is situated against the scholarly background of music iconography, an
interdisciplinary field at the intersection of musicology and art history. We benchmark a
number of state-of-the-art systems for image classification and object detection. Our results
demonstrate the feasibility of the task but also highlight the significant challenges which this
artistic material poses to computer vision. We evaluate the system on an out-of-sample
collection and offer an interpretive discussion of the false positives detected. The error
analysis yields a number of unexpected insights into the contextual cues that trigger the
detector. The iconography surrounding children and musical instruments, for instance,
shares some core properties, such as an intimacy in body language.
Disciplines :
Computer science
Author, co-author :
Sabatelli, Matthia ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorith. des syst. en interaction avec le monde physique
Banar, Nikolay; University of Antwerp
Cocriamont, Marie; Royal Museums of Art and History
Coudyzer, Eva; Royal Institute for Cultural Heritage
Lasaracina, Karine; Royal Museums of Fine Arts of Belgium
Daelemans, Walter; University of Antwerp
Geurts, Pierre ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorith. des syst. en interaction avec le monde physique
Kestemont, Mike; University of Antwerp
Language :
English
Publication date :
February 2021
Journal title :
Digital Humanities Quarterly
eISSN :
1938-4122
Publisher :
Northeastern University, Boston, United States - Massachusetts