🤖 AI Summary
Problem: The high cost and low efficiency of individual tree crown segmentation hinder large-scale forest ecological monitoring. Method: We propose BalSAM, the first framework to systematically evaluate how digital surface model (DSM) elevation features improve the Segment Anything Model (SAM) for tree crown instance segmentation across multiple biomes. BalSAM introduces an end-to-end fine-tuning architecture that enables deep fusion of RGB imagery and DSM-derived height features. Results: While the original SAM underperforms a customized Mask R-CNN by a large margin in plantation scenes, BalSAM achieves an 18.7% mAP improvement over the baseline SAM and outperforms all compared methods. It also demonstrates superior cross-biome generalization and improved computational efficiency, striking a better balance between accuracy and scalability. This work establishes a new paradigm for low-cost, high-precision, and scalable individual-tree monitoring using UAV-based remote sensing.
📝 Abstract
Information on trees at the individual level is crucial for monitoring forest ecosystems and planning forest management. Current monitoring methods rely on ground measurements, which are costly, time-consuming and labor-intensive. Advances in drone remote sensing and computer vision offer great potential for mapping individual trees from aerial imagery at broad scale. Large pre-trained vision models, such as the Segment Anything Model (SAM), are a particularly compelling choice when labeled data are limited. In this work, we compare methods leveraging SAM for automatic tree crown instance segmentation in high-resolution drone imagery in three use cases: 1) boreal plantations, 2) temperate forests and 3) tropical forests. We also study the integration of elevation data into the models in the form of Digital Surface Model (DSM) information, which can readily be obtained at no additional cost from RGB drone imagery. We present BalSAM, a model leveraging SAM and DSM information, which shows potential over other methods, particularly for plantations. We find that methods using SAM out-of-the-box do not outperform a custom Mask R-CNN, even with well-designed prompts. However, efficiently tuning SAM end-to-end and integrating DSM information are both promising avenues for tree crown instance segmentation models.
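To make the idea of combining RGB imagery with DSM elevation concrete, the sketch below shows one simple option: rescaling a DSM tile to [0, 1] and appending it to each pixel as a fourth channel. This is only an illustration under stated assumptions. The abstract does not say how BalSAM fuses the two inputs, so the early-fusion design, the per-tile min-max scaling, and the function names `normalize_dsm` and `stack_rgb_dsm` are all hypothetical. Plain Python lists are used to keep the example dependency-free.

```python
def normalize_dsm(dsm):
    """Min-max scale a 2D DSM tile to [0, 1]; a flat tile maps to all zeros."""
    values = [h for row in dsm for h in row]
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in dsm]
    return [[(h - lo) / span for h in row] for row in dsm]


def stack_rgb_dsm(rgb, dsm):
    """Return an H x W grid of (r, g, b, height) tuples scaled to [0, 1].

    rgb: H x W grid of (r, g, b) values in 0..255
    dsm: H x W grid of surface elevations, same shape as rgb

    Assumption: early fusion by channel stacking. This is NOT taken from
    the paper; it is one possible way to feed elevation to a model.
    """
    if len(rgb) != len(dsm) or any(len(a) != len(b) for a, b in zip(rgb, dsm)):
        raise ValueError("RGB and DSM tiles must have the same shape")
    height = normalize_dsm(dsm)
    return [
        [(r / 255.0, g / 255.0, b / 255.0, h)
         for (r, g, b), h in zip(rgb_row, h_row)]
        for rgb_row, h_row in zip(rgb, height)
    ]


# Toy 2 x 2 tile: four pixels with elevations between 100 m and 120 m.
rgb_tile = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
dsm_tile = [[100.0, 110.0], [105.0, 120.0]]
fused = stack_rgb_dsm(rgb_tile, dsm_tile)
```

Scaling per tile removes absolute terrain elevation, so the height channel only encodes relative differences within the tile. Whether that is appropriate depends on the terrain and tile size, and it is a design choice of this sketch rather than something the abstract specifies.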