🤖 AI Summary
Problem: Wild species such as muskoxen are sparsely distributed, so ground-truth annotations for remote sensing surveys are severely scarce. Method: This study systematically evaluates synthetic imagery (SI) for zero-shot (ZS) and few-shot (FS) object detection in wildlife monitoring. We propose a training paradigm that initializes deep detection models (e.g., Faster R-CNN) exclusively on high-fidelity SI generated from high-resolution aerial imagery, followed by incremental fine-tuning with limited real-world annotations. Contribution/Results: In ZS settings, adding SI improves precision, recall, and F1 score, with performance plateauing once the SI volume exceeds 100% of the baseline training dataset. In FS settings, recall improves and overall accuracy is slightly higher than with real images alone, although these gains are not statistically significant. This work points to a scalable, cost-effective data augmentation pathway that could enable frequent, low-cost remote sensing monitoring of rare and sparsely distributed wildlife.
📝 Abstract
Accurate population estimates are essential for wildlife management, providing critical insights into species abundance and distribution. Traditional survey methods, including visual aerial counts and GNSS telemetry tracking, are widely used to monitor muskox populations in Arctic regions. These approaches are resource-intensive and constrained by logistical challenges. Advances in remote sensing, artificial intelligence, and high-resolution aerial imagery offer promising alternatives for wildlife detection. Yet the effectiveness of deep learning object detection models (ODMs) is often limited by small datasets, making it challenging to train robust ODMs for sparsely distributed species like muskoxen. This study investigates the integration of synthetic imagery (SI) to supplement limited training data and improve muskox detection in zero-shot (ZS) and few-shot (FS) settings. We compared a baseline model trained on real imagery with five ZS and five FS models that incorporated progressively more SI in the training set. For the ZS models, where no real images were included in the training set, adding SI improved detection performance. As more SI was added, precision, recall, and F1 score increased but eventually plateaued, suggesting diminishing returns once SI exceeded 100% of the baseline model's training dataset. For the FS models, combining real imagery and SI led to better recall and slightly higher overall accuracy than using real images alone, though these improvements were not statistically significant. Our findings demonstrate the potential of SI for training accurate ODMs when data are scarce. This opens new possibilities for wildlife monitoring by making it feasible to monitor rare or inaccessible species and to survey more frequently. The approach could be used to initiate ODMs without real data and refine them as real images are acquired over time.
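The abstract reports results as precision, recall, and F1 score. As a reminder of how these three metrics relate, here is a minimal sketch using their standard definitions. The function name `detection_scores` and the counts are hypothetical, chosen only for illustration. They are not taken from the study, and the abstract does not describe how detections were matched to ground truth.

```python
def detection_scores(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from detection counts.

    tp: detections matched to a real animal
    fp: detections with no matching animal
    fn: animals that were not detected
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration only (not results from the study).
p, r, f1 = detection_scores(tp=8, fp=2, fn=8)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a model can gain recall (fewer missed animals) while its F1 score moves only slightly if precision does not improve with it.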