data augmentation; few-shot learning; muskox; synthetic images; wildlife survey; zero-shot learning; Ecology, Evolution, Behavior and Systematics; Ecology; Computers in Earth Sciences; Nature and Landscape Conservation; Computer Science - Computer Vision and Pattern Recognition
Abstract :
Accurate population estimates are essential for wildlife management, providing critical insights into species abundance and distribution. Traditional survey methods, including visual aerial counts and GNSS telemetry tracking, are widely used to monitor muskox (Ovibos moschatus) populations in Arctic regions. These approaches are resource-intensive and constrained by logistical challenges. Advances in remote sensing, artificial intelligence, and high-resolution aerial imagery offer promising alternatives for wildlife detection. Yet the effectiveness of deep learning object detection models (ODMs) is often limited by small datasets, making it challenging to train robust ODMs for sparsely distributed species like muskoxen. This study investigates the integration of synthetic imagery, created with diffusion-based models, to supplement limited training data and improve muskox detection in zero-shot and few-shot settings. We compared a baseline model trained solely on real imagery with five zero-shot (ZS1–ZS5) and five few-shot (FS1–FS5) models that incorporated progressively more synthetic imagery in the training set. For the zero-shot models, where no real images were included in the training set, adding synthetic imagery improved detection performance. As more synthetic images were added, precision, recall, and F1 score increased but eventually plateaued, suggesting diminishing returns once the number of synthetic images exceeded 100% of the baseline model's training dataset. For the few-shot models, combining real and synthetic images led to better recall and slightly higher overall accuracy than using real images alone, though these improvements were not statistically significant. Our findings demonstrate the potential of synthetic images to train accurate ODMs when data are scarce, offering important perspectives for wildlife monitoring by enabling the monitoring of rare or inaccessible species and by increasing monitoring frequency. This approach could be used to initiate ODMs without real data and refine them as real images are acquired over time.
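For readers less familiar with the detection metrics named in the abstract, the following is a minimal sketch of how precision, recall, and F1 score are conventionally computed from true-positive, false-positive, and false-negative detection counts. The function name and the counts in the usage lines are hypothetical illustrations, not values or code from the study.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Compute precision, recall, and F1 from detection counts.

    tp: detections matched to a real animal (true positives)
    fp: detections with no matching animal (false positives)
    fn: animals the model missed (false negatives)
    """
    # Guard against division by zero when a model makes no detections
    # or when the test set contains no animals.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only.
metrics = detection_metrics(tp=80, fp=20, fn=20)
print(metrics)
```

Comparing these three values across models trained with increasing amounts of synthetic imagery is one way to observe the kind of plateau the abstract describes.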
Disciplines :
Environmental sciences & ecology; Computer science
Author, co-author :
Durand, Simon; Department of Applied Geomatics, Université de Sherbrooke, Sherbrooke, Canada ; Quebec Centre for Biodiversity Science (QCBS), Stewart Biology, McGill University, Montréal, Canada
Foucher, Samuel ; Department of Applied Geomatics, Université de Sherbrooke, Sherbrooke, Canada
Delplanque, Alexandre ; Université de Liège - ULiège > Département GxABT > Gestion des ressources forestières
Taillon, Joëlle; Direction générale de la gestion de la faune, Ministère de l'Environnement, de la Lutte contre les Changements Climatiques, de la Faune et des Parcs (MELCCFP), Québec, Canada
Théau, Jérôme ; Department of Applied Geomatics, Université de Sherbrooke, Sherbrooke, Canada ; Quebec Centre for Biodiversity Science (QCBS), Stewart Biology, McGill University, Montréal, Canada
Language :
English
Title :
Lacking data? No worries! How synthetic images can alleviate image scarcity in wildlife surveys: A case study with muskox (Ovibos moschatus)