FoMo: Multi-Modal, Multi-Scale and Multi-Task Remote Sensing Foundation Models for Forest Monitoring

📅 2023-12-15
📈 Citations: 10
Influential: 0
🤖 AI Summary
Existing remote sensing models struggle to jointly model heterogeneous, multi-source forest monitoring data (e.g., satellite, aerial, LiDAR, SAR) across diverse tasks (classification, segmentation, detection). Method: This paper introduces FoMo, the first multimodal, multi-scale, multi-task foundation model suite for forest monitoring. It comprises the unified benchmark FoMo-Bench (15 datasets), the global fine-grained tree species dataset TalloS (1,000+ classes), and the flexible pretraining framework FoMo-Net. FoMo-Net supports arbitrary modality combinations via novel components: multimodal feature fusion, cross-resolution alignment, remote sensing spectral adaptive encoding, and multi-task decoupled heads. Contribution/Results: On FoMo-Bench, FoMo achieves a 12.7% absolute accuracy gain over single-modality state-of-the-art methods in tree species classification, with significantly improved generalization across geographic regions and sensor platforms.
📝 Abstract
Forests are vital to ecosystems, supporting biodiversity and essential services, but are rapidly changing due to land use and climate change. Understanding and mitigating negative effects requires parsing data on forests at global scale from a broad array of sensory modalities, and using them in diverse forest monitoring applications. Such diversity in data and applications can be effectively addressed through the development of a large, pre-trained foundation model that serves as a versatile base for various downstream tasks. However, remote sensing modalities, which are an excellent fit for several forest management tasks, are particularly challenging considering the variation in environmental conditions, object scales, image acquisition modes, spatio-temporal resolutions, etc. With that in mind, we present the first unified Forest Monitoring Benchmark (FoMo-Bench), carefully constructed to evaluate foundation models with such flexibility. FoMo-Bench consists of 15 diverse datasets encompassing satellite, aerial, and inventory data, covering a variety of geographical regions, and including multispectral, red-green-blue, synthetic aperture radar and LiDAR data with various temporal, spatial and spectral resolutions. FoMo-Bench includes multiple types of forest-monitoring tasks, spanning classification, segmentation, and object detection. To enhance task and geographic diversity in FoMo-Bench, we introduce TalloS, a global dataset combining satellite imagery with ground-based annotations for tree species classification across 1,000+ categories and hierarchical taxonomic levels. Finally, we propose FoMo-Net, a pre-training framework to develop foundation models with the capacity to process any combination of commonly used modalities and spectral bands in remote sensing.
Problem

Research questions and friction points this paper is trying to address.

Develop a foundation model for forest monitoring
Address diverse remote sensing data challenges
Evaluate model with unified forest monitoring benchmark
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-modal remote sensing foundation model
Global forest monitoring benchmark FoMo-Bench
Pre-training framework FoMo-Net
N. Bountos
Mila Quebec AI Institute, Orion Lab, National Observatory of Athens & National Technical University of Athens, Harokopio University of Athens
Arthur Ouaknine
McGill University, Mila
deep learning, machine learning, signal processing, computer vision
Ioannis Papoutsis
National Technical University of Athens; National Observatory of Athens
Earth Observation, SAR Interferometry, Deep/Machine learning
D. Rolnick
Mila Quebec AI Institute, McGill University