SelvaBox: A high-resolution dataset for tropical tree crown detection

📅 2025-06-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Detecting individual tree crowns in tropical forests is difficult because crowns vary widely in size and structure, overlap heavily, and annotated data are scarce. To address this, we introduce SelvaBox, the largest open-access dataset for tropical tree crown detection in high-resolution drone imagery. It spans three countries and contains more than 83,000 manually labeled crowns, an order of magnitude more than all previous tropical forest datasets combined. Benchmarks show that higher-resolution inputs consistently improve detection accuracy and that models trained only on SelvaBox achieve competitive zero-shot performance on unseen tropical tree crown datasets. Jointly training on SelvaBox and three other datasets at 3 to 10 cm per pixel in a unified multi-resolution pipeline yields a detector that ranks first or second across all evaluated datasets. We publicly release the dataset, code, and pre-trained weights.

📝 Abstract
Detecting individual tree crowns in tropical forests is essential to study these complex and crucial ecosystems impacted by human interventions and climate change. However, tropical crowns vary widely in size, structure, and pattern and are largely overlapping and intertwined, requiring advanced remote sensing methods applied to high-resolution imagery. Despite growing interest in tropical tree crown detection, annotated datasets remain scarce, hindering robust model development. We introduce SelvaBox, the largest open-access dataset for tropical tree crown detection in high-resolution drone imagery. It spans three countries and contains more than 83,000 manually labeled crowns - an order of magnitude larger than all previous tropical forest datasets combined. Extensive benchmarks on SelvaBox reveal two key findings: (1) higher-resolution inputs consistently boost detection accuracy; and (2) models trained exclusively on SelvaBox achieve competitive zero-shot detection performance on unseen tropical tree crown datasets, matching or exceeding competing methods. Furthermore, jointly training on SelvaBox and three other datasets at resolutions from 3 to 10 cm per pixel within a unified multi-resolution pipeline yields a detector ranking first or second across all evaluated datasets. Our dataset, code, and pre-trained weights are made public.
Problem

Research questions and friction points this paper is trying to address.

Detecting diverse tropical tree crowns in complex ecosystems
Addressing scarcity of annotated datasets for robust model training
Improving detection accuracy with high-resolution drone imagery
Innovation

Methods, ideas, or system contributions that make the work stand out.

High-resolution drone imagery dataset
Multi-resolution unified pipeline
Zero-shot competitive detection performance
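
To make the "3 to 10 cm per pixel" range concrete, the sketch below works out how many pixels a drone tile would have after being resampled to a coarser ground sampling distance (GSD). It is only an illustration of the arithmetic behind multi-resolution inputs, not the authors' pipeline: the function name `resample_shape` and the 2000 x 2000 px tile at 3 cm/px are hypothetical.

```python
def resample_shape(height_px, width_px, native_gsd_cm, target_gsd_cm):
    """Pixel size of a tile after resampling from its native ground
    sampling distance (cm per pixel) to a target one.

    The ground footprint stays the same; only the pixel count changes.
    """
    if native_gsd_cm <= 0 or target_gsd_cm <= 0:
        raise ValueError("GSD values must be positive")
    # A scale below 1 means downsampling (coarser resolution).
    scale = native_gsd_cm / target_gsd_cm
    return max(1, round(height_px * scale)), max(1, round(width_px * scale))


if __name__ == "__main__":
    # Hypothetical tile: 2000 x 2000 px captured at 3 cm/px (a 60 m x 60 m footprint).
    for target_gsd in (3.0, 6.0, 10.0):
        print(target_gsd, resample_shape(2000, 2000, 3.0, target_gsd))
```

At 10 cm/px the same 60 m footprint is covered by 600 x 600 px, so each crown spans far fewer pixels than at 3 cm/px. That is one plausible reading of why the abstract reports that higher-resolution inputs help detection.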
Hugo Baudchon
Mila – Quebec AI Institute, Université de Montréal
Arthur Ouaknine
McGill University, Mila
deep learning, machine learning, signal processing, computer vision
Martin Weiss
Mila – Quebec AI Institute, Université de Montréal
Mélisande Teng
Mila, Université de Montréal
Thomas R. Walla
Colorado Mesa University
Antoine Caron-Guay
Université de Montréal
Christopher Pal
Mila – Quebec AI Institute, Polytechnique Montreal
Etienne Laliberté
Université de Montréal
Plant ecology, Functional ecology, Remote sensing