Where are the Whales: A Human-in-the-loop Detection Method for Identifying Whales in High-resolution Satellite Imagery

📅 2025-10-16
🤖 AI Summary
To address the high cost and poor scalability of whale monitoring in very high-resolution (VHR) satellite imagery, this paper proposes a semi-automated detection framework that requires no labeled training data. The method uses statistical anomaly detection to flag spatial outliers as candidate whale detections, then routes those candidates through a human-in-the-loop workflow in which experts validate them in a web-based labeling interface. The key contribution is dropping supervised learning for the initial screening: the unsupervised anomaly detector narrows the search so that experts inspect only a small fraction of each scene. On three benchmark scenes with known whale annotations, the framework achieves recalls of 90.3% to 96.4% while reducing the area requiring expert inspection by up to 99.8%, in some cases from over 1,000 km² to less than 2 km². The authors present the approach as a scalable first step toward machine-assisted marine mammal monitoring from space.

📝 Abstract
Effective monitoring of whale populations is critical for conservation, but traditional survey methods are expensive and difficult to scale. While prior work has shown that whales can be identified in very high-resolution (VHR) satellite imagery, large-scale automated detection remains challenging due to a lack of annotated imagery, variability in image quality and environmental conditions, and the cost of building robust machine learning pipelines over massive remote sensing archives. We present a semi-automated approach for surfacing possible whale detections in VHR imagery using a statistical anomaly detection method that flags spatial outliers, i.e. "interesting points". We pair this detector with a web-based labeling interface designed to enable experts to quickly annotate the interesting points. We evaluate our system on three benchmark scenes with known whale annotations and achieve recalls of 90.3% to 96.4%, while reducing the area requiring expert inspection by up to 99.8% -- from over 1,000 sq km to less than 2 sq km in some cases. Our method does not rely on labeled training data and offers a scalable first step toward future machine-assisted marine mammal monitoring from space. We have open sourced this pipeline at https://github.com/microsoft/whales.
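As a rough illustration of what "flagging spatial outliers" can look like, the sketch below scores every pixel of a single-band tile with a robust z-score (median and median absolute deviation) and returns the pixels that exceed a threshold. This is an assumption-laden toy, not the detector from the paper or the `microsoft/whales` repository: the function name `flag_interesting_points`, the MAD-based statistic, and the threshold of 6.0 are all hypothetical choices made for this example, and plain Python lists are used only to keep it dependency-free.

```python
import random
from statistics import median


def flag_interesting_points(tile, threshold=6.0):
    """Return (row, col) positions whose robust z-score exceeds `threshold`.

    `tile` is a 2D list of pixel intensities for one band.
    Hypothetical helper for illustration; not the paper's implementation.
    """
    values = [v for row in tile for v in row]
    center = median(values)
    # Median absolute deviation, scaled by 1.4826 so it approximates the
    # standard deviation when the background is roughly Gaussian.
    mad = median(abs(v - center) for v in values)
    scale = max(1.4826 * mad, 1e-12)  # guard against a perfectly flat tile
    return [
        (r, c)
        for r, row in enumerate(tile)
        for c, v in enumerate(row)
        if abs(v - center) / scale > threshold
    ]


# Synthetic "open water" tile with one bright pixel standing in for a target.
rng = random.Random(0)
tile = [[rng.gauss(100.0, 1.0) for _ in range(50)] for _ in range(50)]
tile[20][30] = 130.0
candidates = flag_interesting_points(tile)
print(candidates)
```

In a workflow like the one the abstract describes, the returned positions would be the "interesting points" handed to an expert for review, so the threshold trades recall against the amount of imagery a person has to look at.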
Problem

Research questions and friction points this paper is trying to address.

Automated whale detection in satellite imagery faces data and scalability challenges
Traditional whale monitoring methods are costly and difficult to scale effectively
Current approaches struggle with limited annotations and environmental variability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Statistical anomaly detection for whale identification
Web-based labeling interface for expert annotation
Unsupervised method reducing the area requiring expert inspection by up to 99.8%
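The two headline metrics are simple ratios, and the short sketch below shows how they can be computed. The function names are hypothetical, the area figures are the rounded bounds quoted in the abstract (over 1,000 sq km down to less than 2 sq km), and the recall counts are made-up illustrative numbers, not counts from the paper.

```python
def recall(detected: int, annotated: int) -> float:
    """Fraction of annotated whales that the detector surfaced."""
    if annotated <= 0:
        raise ValueError("annotated must be positive")
    return detected / annotated


def area_reduction(scene_km2: float, review_km2: float) -> float:
    """Fraction of the scene that no longer needs expert inspection."""
    if scene_km2 <= 0:
        raise ValueError("scene_km2 must be positive")
    return 1.0 - review_km2 / scene_km2


# Rounded bounds from the abstract: 1,000 sq km scene, 2 sq km left to review.
print(f"{area_reduction(1000.0, 2.0):.1%}")  # prints 99.8%

# Illustrative counts only (not from the paper): 9 of 10 whales surfaced.
print(f"{recall(9, 10):.1%}")  # prints 90.0%
```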
Caleb Robinson
Microsoft AI for Good
computational sustainability, deep learning, human migration
Kimberly T. Goetz
Marine Mammal Laboratory, Alaska Fisheries Science Center, National Marine Fisheries Service, NOAA, Seattle, Washington, USA
Christin B. Khan
Northeast Fisheries Science Center, National Marine Fisheries Service, NOAA, Woods Hole, Massachusetts, USA
Meredith Sackett
Azura Consulting, under contract to NOAA Fisheries, Northeast Fisheries Science Center, Woods Hole, MA USA
Kathleen Leonard
Protected Resources Division, Alaska Regional Office, National Marine Fisheries Service, NOAA, Anchorage, AK
Rahul Dodhia
Deputy Director, AI for Good Research Lab, Microsoft
generative AI, artificial intelligence, statistics, computer vision, geospatial imagery
Juan M. Lavista Ferres
Chief Scientist and Lab Director, Microsoft AI for Good Research Lab
Medical Imaging, Deep Learning, Causality, Machine Learning, Data Science