Keywords :
Aerial wildlife censuses; Annotation effort; Density estimation; Object detection; Object localization; Point annotations; Pseudo labels; Ecology, Evolution, Behavior and Systematics; Modeling and Simulation; Ecology; Ecological Modeling; Computer Science Applications; Computational Theory and Mathematics; Applied Mathematics
Abstract :
[en] Aircraft-based monitoring of wildlife is a popular method among conservation practitioners for obtaining animal population counts over large areas. These aerial censuses are becoming increasingly scalable thanks to the advent of drone technology, which is frequently combined with deep learning-based image recognition. Yet the annotation burden associated with training deep learning architectures remains a problem, especially for commonly used bounding box detection models. Point-based density estimation and localization models are cheaper to train and often work better when the aerial imagery is recorded at an oblique angle. Beyond this, however, there is currently little consensus about which strategy to use for which kind of data. In this work, we address this knowledge gap and evaluate modifications to a state-of-the-art detection model (YOLOv8) that minimize labeling effort by enabling it to work on point-annotated images. We study the effect of these adjustments on detection accuracy and extensively compare them to a localization architecture on four datasets consisting of nadir and oblique images. The goal of this paper is to offer wildlife conservationists practical advice on which of the recently proposed deep learning architectures to use given the properties of their images, as well as on the data properties that maximize model performance independently of the architecture. We find that counting accuracy can largely be maintained at reduced annotation effort, that object detection outperforms the localization approach on nadir images, and that it shows competitive performance in the oblique setting. The images used to obtain the results presented in this paper are available on Zenodo for all publicly available datasets, and all code necessary to reproduce our results has been uploaded to GitHub.
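As a rough illustration of the point-to-pseudo-box idea mentioned in the abstract (a minimal sketch, not the authors' published implementation): click-style point annotations in pixel coordinates can be converted into fixed-size pseudo-boxes in the YOLO label format, so that a standard detector such as YOLOv8 can be trained from point labels alone. The box size, point format, and output path below are assumptions chosen purely for illustration.

# Hypothetical sketch: turn (x, y, class_id) point annotations into
# fixed-size pseudo-boxes in YOLO label format (class cx cy w h, normalised).
from pathlib import Path

def points_to_pseudo_boxes(points, img_w, img_h, box_px=40):
    """Convert point annotations in pixel coordinates into YOLO-style
    pseudo-box rows using a fixed box size (box_px) in pixels."""
    rows = []
    for x, y, class_id in points:
        cx = x / img_w            # normalised box centre
        cy = y / img_h
        w = box_px / img_w        # fixed pseudo-box extent, normalised
        h = box_px / img_h
        rows.append(f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return rows

if __name__ == "__main__":
    # Example: three animals annotated as single clicks on a 4000 x 3000 image.
    points = [(1024.0, 768.0, 0), (2050.5, 1500.0, 0), (3300.0, 2200.0, 1)]
    label_lines = points_to_pseudo_boxes(points, img_w=4000, img_h=3000)
    Path("image_0001.txt").write_text("\n".join(label_lines) + "\n")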
Disciplines :
Life sciences: Multidisciplinary, general & others
Computer science
Author, co-author :
May, Giacomo; Environmental Computational Science and Earth Observation Laboratory, École Polytechnique Fédérale de Lausanne, Sion, Switzerland
Dalsasso, Emanuele; Environmental Computational Science and Earth Observation Laboratory, École Polytechnique Fédérale de Lausanne, Sion, Switzerland
Delplanque, Alexandre; Université de Liège - ULiège > Département GxABT > Gestion des ressources forestières
Kellenberger, Benjamin; Centre for Biodiversity and Environmental Research, University College London, London, United Kingdom; Department of Ecology and Evolutionary Biology, Yale University, Osborn Memorial Laboratories, New Haven, United States
Tuia, Devis; Environmental Computational Science and Earth Observation Laboratory, École Polytechnique Fédérale de Lausanne, Sion, Switzerland
Language :
English
Title :
How to minimize the annotation effort in aerial wildlife surveys
Funding :
This research has been carried out as part of the project WildDrone, funded by the European Union's Horizon Europe Research and Innovation Program under the Marie Skłodowska-Curie Grant Agreement No. 101071224, the EPSRC funded Autonomous Drones for Nature Conservation Missions grant (EP/X029077/1), and the Swiss State Secretariat for Education, Research and Innovation (SERI) under contract number 22.00280.