AI Summary
A critical gap exists in publicly available, synchronized multimodal wildlife monitoring datasets supporting endangered species research and habitat management. To address this, we introduce SmartWilds, the first open-source, multimodal dataset featuring spatiotemporally aligned acquisitions from unmanned aerial vehicle (UAV) remote sensing, camera traps (images/videos), and bioacoustic recorders. We propose a reproducible multimodal monitoring protocol and systematically evaluate the complementary strengths of these sensor modalities for land-use classification, species detection, and behavioral recognition. A four-day field deployment captured diverse rare and native species across heterogeneous habitats, yielding benchmark data for seasonal dynamics analysis and individual-level tracking. This work bridges the multimodal conservation monitoring gap, advancing open science, cross-modal fusion analytics, and conservation-oriented computer vision.
Abstract
We present the first release of SmartWilds, a multimodal wildlife monitoring dataset. SmartWilds is a synchronized collection of drone imagery, camera trap photographs and videos, and bioacoustic recordings collected during summer 2025 at The Wilds safari park in Ohio. This dataset supports multimodal AI research for comprehensive environmental monitoring, addressing critical needs in endangered species research, conservation ecology, and habitat management. Our pilot deployment captured four days of synchronized monitoring across three modalities in a 220-acre pasture containing Père David's deer, Sichuan takin, and Przewalski's horses, as well as species native to Ohio, including bald eagles, white-tailed deer, and coyotes. We provide a comparative analysis of sensor modality performance, demonstrating complementary strengths for land-use pattern analysis, species detection, behavioral analysis, and habitat monitoring. This work establishes reproducible protocols for multimodal wildlife monitoring while contributing open datasets to advance conservation computer vision research. Future releases will include synchronized GPS tracking data from tagged individuals, citizen science data, and expanded temporal coverage across multiple seasons.