PrediTree: A Multi-Temporal Sub-meter Dataset of Multi-Spectral Imagery Aligned With Canopy Height Maps

📅 2025-09-01
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
A lack of high-quality, multi-temporal, multispectral, and LiDAR-derived canopy height model (CHM) co-registered open datasets hinders sub-meter tree-height prediction. Method: We introduce PrediTree, the first open, sub-meter (0.5 m), multi-temporal, multispectral image–CHM paired dataset covering diverse forest ecosystems across France, comprising 3.14 million samples, and propose a U-Net-based encoder–decoder architecture with explicit temporal awareness, jointly modeling multi-temporal multispectral imagery and inter-image time-difference cues for end-to-end CHM regression. Contribution/Results: Trained on PrediTree, our model achieves a masked mean squared error of 11.78%, outperforming ResNet-50 by ~12% and a single-date RGB baseline by ~30%. This work bridges critical gaps in high-resolution forest structural dynamics modeling, providing the first large-scale, temporally aligned benchmark, and in methodological capability for fine-grained, time-aware CHM prediction.

📝 Abstract
We present PrediTree, the first comprehensive open-source dataset designed for training and evaluating tree height prediction models at sub-meter resolution. This dataset combines very high-resolution (0.5 m) LiDAR-derived canopy height maps, spatially aligned with multi-temporal and multi-spectral imagery, across diverse forest ecosystems in France, totaling 3,141,568 images. PrediTree addresses a critical gap in forest monitoring capabilities by enabling the training of deep learning methods that can predict tree growth based on multiple past observations. To make use of the PrediTree dataset, we propose an encoder-decoder framework that takes the multi-temporal multi-spectral imagery together with the relative time differences, in years, between the target canopy height map timestamp and each image acquisition date, and predicts the canopy height. The conducted experiments demonstrate that a U-Net architecture trained on the PrediTree dataset achieves the lowest masked mean squared error of 11.78%, outperforming the next-best architecture, ResNet-50, by around 12%, and cutting the error of the same experiments run on fewer bands (red, green, and blue only) by around 30%. The dataset is publicly available on HuggingFace, and both the processing and training codebases are available on GitHub.
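The abstract's "masked mean squared error" is reported without a formal definition here; a minimal sketch of the metric, assuming the mask simply marks pixels with a valid LiDAR-derived height (the function name and tensor shapes are illustrative, not taken from the paper's codebase):

```python
import torch

def masked_mse(pred, target, mask):
    """Mean squared error computed only over pixels with a valid CHM value.

    pred, target: (B, H, W) float tensors of canopy heights.
    mask: (B, H, W) bool tensor marking valid LiDAR-derived pixels.
    """
    sq_err = (pred - target) ** 2
    # Average the squared error over valid pixels only; invalid
    # (e.g. no-data) pixels contribute nothing to the loss.
    return sq_err[mask].mean()

# Toy example: predictions match the target exactly on the valid pixels,
# so the masked loss is zero even though the masked-out pixels disagree.
pred = torch.tensor([[[1.0, 2.0], [3.0, 4.0]]])
target = torch.tensor([[[1.0, 0.0], [3.0, 0.0]]])
mask = torch.tensor([[[True, False], [True, False]]])
loss = masked_mse(pred, target, mask)  # → 0.0
```

Because the loss ignores no-data pixels, the model is never penalized for regions where the LiDAR survey produced no reliable height, which is common at tile edges and over water.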
Problem

Research questions and friction points this paper is trying to address.

Predicting tree height from multi-temporal multi-spectral imagery
Addressing limited training data for forest monitoring deep learning
Aligning canopy height maps with multi-spectral images temporally
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses multi-temporal multi-spectral imagery with LiDAR data
Proposes encoder-decoder framework for height prediction
U-Net architecture achieves best performance metrics
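The digest does not specify how the per-image time differences are fed to the encoder-decoder; one plausible scheme, shown here purely as an assumption-laden sketch, is to append a constant-valued channel per acquisition holding its offset in years from the target CHM date, then concatenate all acquisitions along the channel axis of a 2D U-Net input:

```python
import torch

def stack_temporal_inputs(images, time_deltas):
    """Build a single 2D-network input from T dated acquisitions.

    images: (T, C, H, W) tensor of T multispectral images.
    time_deltas: length-T sequence of offsets (years) between each
        acquisition date and the target CHM timestamp.
    Returns a (T*(C+1), H, W) tensor: each image gains one extra
    channel filled with its time offset.
    """
    T, C, H, W = images.shape
    chunks = []
    for t in range(T):
        # Constant channel carrying this acquisition's time offset.
        dt = torch.full((1, H, W), float(time_deltas[t]))
        chunks.append(torch.cat([images[t], dt], dim=0))
    return torch.cat(chunks, dim=0)

# Three 4-band (e.g. R, G, B, NIR) 64x64 acquisitions, 1-3 years old.
x = stack_temporal_inputs(torch.rand(3, 4, 64, 64), [1.0, 2.0, 3.0])
# x has 3 * (4 + 1) = 15 channels; x.unsqueeze(0) would feed a U-Net
# configured with 15 input channels.
```

This channel-concatenation approach is only one way to give a convolutional encoder "explicit temporal awareness"; alternatives such as learned time embeddings or recurrent fusion would serve the same purpose.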