🤖 AI Summary
To address the high labor cost and limited spatial coverage of conventional field surveys, this study proposes a multimodal framework for assessing forest biodiversity potential that combines high-resolution 2D orthophotos with 3D airborne laser scanning (ALS) point clouds. The contributions are threefold: (1) BioVista, a benchmark dataset of paired orthophotos and ALS point clouds labeled for forest biodiversity potential; (2) unimodal baselines that use ResNet for orthophotos and PointVector for point clouds; (3) two fusion strategies, a confidence-based ensemble and feature-level concatenation, that combine spectral and 3D structural information. The feature-level fusion model reaches a mean classification accuracy of 75.5%, compared with 72.8% for the best unimodal baseline (PointVector on ALS point clouds) and 69.4% for ResNet on orthophotos. The approach offers a scalable, low-labor option for large-scale biodiversity monitoring, conservation planning, and ecosystem management.
📝 Abstract
Accurate assessment of forest biodiversity is crucial for ecosystem management and conservation. While traditional field surveys provide high-quality assessments, they are labor-intensive and spatially limited. This study investigates whether deep learning-based fusion of close-range sensing data from 2D orthophotos (12.5 cm resolution) and 3D airborne laser scanning (ALS) point clouds (8 points/m²) can enhance biodiversity assessment. We introduce the BioVista dataset, comprising 44,378 paired samples of orthophotos and ALS point clouds from temperate forests in Denmark, designed to explore multi-modal fusion approaches for biodiversity potential classification. Using deep neural networks (ResNet for orthophotos and PointVector for ALS point clouds), we investigate each data modality's ability to assess forest biodiversity potential, achieving mean accuracies of 69.4% and 72.8%, respectively. We then explore two fusion approaches: a confidence-based ensemble method and a feature-level concatenation strategy, with the latter achieving a mean accuracy of 75.5%. Our results demonstrate that spectral information from orthophotos and structural information from ALS point clouds effectively complement each other in forest biodiversity assessment.
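The abstract does not spell out how the two fusion approaches are implemented, so the following is only a minimal sketch of what such strategies commonly look like. It assumes that the confidence-based ensemble keeps the prediction of whichever unimodal model assigns the higher maximum softmax probability, and that feature-level fusion concatenates the two feature vectors before a single linear classification layer. The function names, the two-class setup, and the toy weights are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_ensemble(image_logits, point_logits):
    """Assumed rule: trust the model whose top class probability is higher."""
    p_img = softmax(image_logits)
    p_pts = softmax(point_logits)
    chosen = p_img if max(p_img) >= max(p_pts) else p_pts
    return chosen.index(max(chosen))


def concat_fusion(image_feat, point_feat, weights, bias):
    """Assumed rule: concatenate both feature vectors, then apply one linear layer.

    `weights` holds one row per class, each as long as the fused vector.
    """
    fused = list(image_feat) + list(point_feat)
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return logits.index(max(logits))


# Toy usage with two hypothetical classes (0 and 1).
ensemble_class = confidence_ensemble([2.0, 0.0], [0.0, 0.5])
fused_class = concat_fusion(
    image_feat=[1.0, 0.0],
    point_feat=[0.0, 1.0],
    weights=[[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 2.0]],
    bias=[0.0, 0.0],
)
```

In this sketch the ensemble operates on the outputs of two independently trained models, whereas concatenation lets a shared classifier weigh spectral and structural features jointly, which is one plausible reading of why the latter can score higher.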