🤖 AI Summary
Individual re-identification of wild western lowland gorillas relies heavily on manual annotation and suffers from a critical lack of large-scale, in-the-wild video datasets.
Method: We propose an end-to-end automated monitoring system featuring (i) a multi-frame self-supervised pretraining strategy leveraging trajectory consistency to learn domain-specific representations; (ii) differentiable AttnLRP for visualizing and validating model attention on biologically relevant features rather than background artifacts; and (iii) a spatiotemporally constrained clustering algorithm to mitigate over-segmentation and enhance robustness in unsupervised population counting.
Contribution/Results: We release the largest benchmark of wild primate re-identification video to date, comprising three newly collected datasets. Experiments demonstrate that aggregating features from image-based backbone networks outperforms dedicated video architectures. Our system significantly improves individual identification accuracy and population tracking efficiency in real-world field conditions.
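The spatiotemporally constrained clustering mentioned above can be illustrated with a minimal sketch: two tracklets recorded at the same time by different cameras cannot belong to the same individual, so such merges are forbidden during agglomeration, which counteracts over-segmentation without allowing impossible merges. The tracklet record format, function names, and the greedy merge loop here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def cannot_link(a, b):
    # Tracklets overlapping in time at different cameras cannot be one individual
    # (illustrative constraint; records are hypothetical dicts, not the paper's format).
    return a["cam"] != b["cam"] and a["t0"] <= b["t1"] and b["t0"] <= a["t1"]

def constrained_cluster(tracklets, thresh=0.8):
    """Greedy agglomerative merging with spatiotemporal cannot-link constraints."""
    clusters = [[t] for t in tracklets]
    merged = True
    while merged:
        merged = False
        best, pair = thresh, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Skip any merge that would violate a cannot-link constraint.
                if any(cannot_link(a, b) for a in clusters[i] for b in clusters[j]):
                    continue
                # Cosine similarity between mean cluster embeddings.
                ei = np.mean([t["emb"] for t in clusters[i]], axis=0)
                ej = np.mean([t["emb"] for t in clusters[j]], axis=0)
                sim = ei @ ej / (np.linalg.norm(ei) * np.linalg.norm(ej))
                if sim > best:
                    best, pair = sim, (i, j)
        if pair is not None:
            i, j = pair
            clusters[i] += clusters.pop(j)
            merged = True
    return clusters
```

Without the `cannot_link` check, two visually similar but co-occurring tracklets would be merged and the population count would be underestimated; with it, visual similarity alone is not sufficient evidence of identity.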
📝 Abstract
Monitoring critically endangered western lowland gorillas is currently hampered by the immense manual effort required to re-identify individuals from vast archives of camera trap footage. The primary obstacle to automating this process has been the lack of large-scale, "in-the-wild" video datasets suitable for training robust deep learning models. To address this gap, we introduce a comprehensive benchmark with three novel datasets: Gorilla-SPAC-Wild, the largest video dataset for wild primate re-identification to date; Gorilla-Berlin-Zoo, for assessing cross-domain re-identification generalization; and Gorilla-SPAC-MoT, for evaluating multi-object tracking in camera trap footage. Building on these datasets, we present GorillaWatch, an end-to-end pipeline integrating detection, tracking, and re-identification. To exploit temporal information, we introduce a multi-frame self-supervised pretraining strategy that leverages consistency within tracklets to learn domain-specific features without manual labels. To ensure scientific validity, a differentiable adaptation of AttnLRP verifies that our model relies on discriminative biometric traits rather than background correlations. Extensive benchmarking demonstrates that aggregating features from large-scale image backbones outperforms specialized video architectures. Finally, we address unsupervised population counting by integrating spatiotemporal constraints into standard clustering to mitigate over-segmentation. We publicly release all code and datasets to facilitate scalable, non-invasive monitoring of endangered species.
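The tracklet-consistency pretraining described in the abstract can be sketched as a contrastive (InfoNCE-style) objective: frames drawn from the same tracklet are treated as positives, all other frames in the batch as negatives, so no manual identity labels are needed. This is a hedged, NumPy-only illustration of the general technique; the function name, temperature value, and loss details are assumptions, not the paper's exact objective.

```python
import numpy as np

def tracklet_info_nce(embs, tracklet_ids, temp=0.1):
    """InfoNCE-style loss where frames of the same tracklet are positives.

    embs: (N, D) frame embeddings; tracklet_ids: length-N tracklet labels.
    """
    embs = np.asarray(embs, dtype=float)
    tracklet_ids = np.asarray(tracklet_ids)
    # L2-normalize so the dot product is cosine similarity.
    embs = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    sim = embs @ embs.T / temp
    np.fill_diagonal(sim, -np.inf)  # exclude self-similarity (exp(-inf) = 0)
    log_den = np.log(np.exp(sim).sum(axis=1))
    losses = []
    for i in range(len(embs)):
        pos = tracklet_ids == tracklet_ids[i]
        pos[i] = False  # a frame is not its own positive
        if pos.any():
            # -log( exp(sim_pos) / sum_j exp(sim_ij) ), averaged over positives
            losses.append(np.mean(log_den[i] - sim[i, pos]))
    return float(np.mean(losses))
```

Minimizing this loss pulls frames of the same tracklet together in embedding space and pushes different tracklets apart, which is one plausible way to realize the trajectory-consistency signal the paper describes.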