The SA-FARI Dataset: Segment Anything in Footage of Animals for Recognition and Identification

📅 2025-11-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current wildlife multi-animal tracking (MAT) datasets suffer from limited scale, low species diversity, and insufficient spatiotemporal coverage, hindering the development of generalizable models. To address this, the authors introduce SA-FARI, the first large-scale, high-diversity wildlife MAT benchmark, comprising ~46 hours of densely annotated camera-trap video spanning 99 species categories, with 942,702 bounding box annotations and 16,224 masklet identities; camera-trap locations are anonymized to preserve privacy while still enabling cross-regional generalization. The work uniquely unifies species diversity, broad geographic coverage, and high-fidelity spatio-temporal annotation. Leveraging SA-FARI, the authors conduct a systematic evaluation of state-of-the-art vision-language models (e.g., SAM 3) and vision-only methods on detection and tracking. The benchmark establishes a reproducible foundation for behavioral analysis and population monitoring in ecological conservation.

📝 Abstract
Automated video analysis is critical for wildlife conservation. A foundational task in this domain is multi-animal tracking (MAT), which underpins applications such as individual re-identification and behavior recognition. However, existing datasets are limited in scale, constrained to a few species, or lack sufficient temporal and geographical diversity, leaving no suitable benchmark for training general-purpose MAT models applicable across wild animal populations. To address this, we introduce SA-FARI, the largest open-source MAT dataset for wild animals. It comprises 11,609 camera trap videos collected over approximately 10 years (2014-2024) from 741 locations across 4 continents, spanning 99 species categories. Each video is exhaustively annotated, culminating in ~46 hours of densely annotated footage containing 16,224 masklet identities and 942,702 individual bounding boxes, segmentation masks, and species labels. Alongside the task-specific annotations, we publish anonymized camera trap locations for each video. Finally, we present comprehensive benchmarks on SA-FARI using state-of-the-art vision-language models for detection and tracking, including SAM 3, evaluated with both species-specific and generic animal prompts. We also compare against vision-only methods developed specifically for wildlife analysis. SA-FARI is the first large-scale dataset to combine high species diversity, multi-region coverage, and high-quality spatio-temporal annotations, offering a new foundation for advancing generalizable multi-animal tracking in the wild. The dataset is available at conservationxlabs.com/SA-FARI (https://www.conservationxlabs.com/sa-fari).
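The annotation structure described in the abstract (per-video masklet identities, each carrying per-frame bounding boxes, segmentation masks, and a species label) can be sketched as follows. The field names and JSON layout below are assumptions for illustration only, not SA-FARI's published schema:

```python
# Hypothetical per-video annotation record for a SA-FARI-style MAT dataset.
# All field names here are assumptions; the real schema may differ.
example_video = {
    "video_id": "vid_000001",
    "location_id": "loc_0042",   # anonymized camera-trap location
    "masklets": [
        {
            "masklet_id": 0,                 # persistent identity within the video
            "species": "Panthera onca",
            "frames": [
                # one entry per annotated frame: box as [x, y, w, h],
                # mask left as a placeholder string here
                {"frame": 0, "bbox": [120, 80, 64, 48], "mask_rle": "..."},
                {"frame": 1, "bbox": [124, 82, 64, 48], "mask_rle": "..."},
            ],
        }
    ],
}

def summarize(videos):
    """Count masklet identities and per-frame box annotations."""
    n_masklets = sum(len(v["masklets"]) for v in videos)
    n_boxes = sum(len(m["frames"]) for v in videos for m in v["masklets"])
    return {"masklets": n_masklets, "boxes": n_boxes}

print(summarize([example_video]))  # {'masklets': 1, 'boxes': 2}
```

Applied over the whole dataset, a summary like this would recover the headline figures the abstract reports (16,224 masklets, 942,702 boxes).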
Problem

Research questions and friction points this paper is trying to address.

Existing datasets lack scale and diversity for wildlife tracking
No suitable benchmark exists for general-purpose multi-animal tracking models
Current datasets are constrained to a few species and limited regions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Largest open-source multi-animal tracking dataset
Combines species diversity with spatio-temporal annotations
Benchmarks vision-language models using species-specific prompts
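The two prompting regimes used in the benchmarks (species-specific prompts vs. a generic "animal" prompt) can be sketched as a small evaluation harness. The detector interface and function names below are hypothetical; SA-FARI's actual benchmark code is not reproduced here:

```python
def build_prompts(species_label):
    """The two prompting regimes: the ground-truth species name
    vs. a generic 'animal' prompt (regime names are assumptions)."""
    return {
        "species_specific": species_label,  # e.g. "jaguar"
        "generic": "animal",
    }

def evaluate(detector, videos):
    """Query a promptable detector once per regime for each video."""
    return {
        regime: [
            detector(v["frames"], build_prompts(v["species"])[regime])
            for v in videos
        ]
        for regime in ("species_specific", "generic")
    }

# Minimal usage with a stub detector that echoes the prompt it received.
stub_detector = lambda frames, prompt: {"prompt": prompt, "n_frames": len(frames)}
videos = [{"frames": [0, 1, 2], "species": "jaguar"}]
out = evaluate(stub_detector, videos)
print(out["generic"][0]["prompt"])  # animal
```

Comparing detection and tracking scores between the two regimes isolates how much a vision-language model relies on the species name versus open-vocabulary animal detection.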
👥 Authors
D. Wasmuht (Conservation X Labs)
Otto Brookes (University of Bristol)
Maximillian Schall (Hasso Plattner Institute)
Pablo Palencia (University of Oviedo)
Christopher Beirne (Osa Conservation)
T. Burghardt (University of Bristol)
Majid Mirmehdi (University of Bristol)
Hjalmar Kuhl (Senckenberg Museum of Natural History)
M. Arandjelovic (Max Planck Institute for Evolutionary Anthropology)
Sam Pottie (Climate Corridors)
Peter Bermant (Conservation X Labs)
Brandon Asheim (Conservation X Labs)
Yi Jin Toh (Conservation X Labs)
Adam Elzinga (Conservation X Labs)
Jason Holmberg (Conservation X Labs)
Andrew Whitworth (Osa Conservation)
Eleanor Flatt (Osa Conservation)
Laura Gustafson (Facebook AI Research)
Chaitanya K. Ryali (Meta)
Yuan-Ting Hu (FAIR, Meta AI)
Baishan Guo (Meta AI)
Andrew Westbury (Meta)
Kate Saenko (Boston University)
Dídac Surís (Meta)