🤖 AI Summary
Plant specimen images exhibit highly heterogeneous backgrounds, which severely degrade the performance of deep learning–based classification models. To address this, we propose a two-stage, object-detection–driven segmentation pipeline: first, a fine-tuned YOLOv10 localizes plant regions and generates bounding box prompts; second, these prompts guide a fine-tuned SAM2 model to achieve high-precision plant region segmentation. This approach effectively suppresses background noise and enhances the robustness of subsequent multi-feature classification. Experiments demonstrate state-of-the-art segmentation performance (IoU = 0.94, Dice = 0.97) and substantial classification gains across five plant trait recognition tasks—achieving up to 4.36% absolute accuracy improvement and 4.15% F1-score gain. Our key contribution is the first integration of YOLOv10 and SAM2 for prompt-guided, fine-grained segmentation of herbarium specimens, establishing a transferable, end-to-end paradigm for biological image analysis.
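The two-stage flow described above (detector proposes plant boxes, boxes prompt the segmenter, per-box masks are merged into one foreground mask) can be sketched as follows. This is a minimal illustration, not the authors' code: `detect_fn` and `sam_fn` are hypothetical stand-ins for the fine-tuned YOLOv10 and SAM2 predictors, and the box/confidence format is assumed.

```python
import numpy as np

def plantsam_pipeline(image, detect_fn, sam_fn, conf_thresh=0.5):
    """Stage 1: run the detector and keep confident boxes.
    Stage 2: use each box as a prompt to the segmenter and
    union the resulting binary masks into one plant mask.
    detect_fn and sam_fn are placeholders for the fine-tuned
    YOLOv10 and SAM2 models (assumed interfaces)."""
    boxes = [b for b in detect_fn(image) if b["conf"] >= conf_thresh]
    mask = np.zeros(image.shape[:2], dtype=bool)
    for box in boxes:
        mask |= sam_fn(image, box["xyxy"])  # one mask per box prompt
    return mask

def apply_mask(image, mask, bg_value=255):
    """Suppress background noise by overwriting non-plant pixels
    (e.g. with white, matching a herbarium sheet) before
    the downstream classifier sees the image."""
    out = image.copy()
    out[~mask] = bg_value
    return out
```

In practice the two callables would wrap real model inference (e.g. a YOLO forward pass returning `xyxy` boxes, and a SAM2 predictor accepting box prompts); the union step matters because a sheet can contain several plant fragments.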
📝 Abstract
Deep learning-based classification of herbarium images is hampered by background heterogeneity, which introduces noise and artifacts that can mislead models and reduce classification accuracy. Addressing these background-related challenges is critical to improving model performance. We introduce PlantSAM, an automated segmentation pipeline that integrates YOLOv10 for plant region detection and the Segment Anything Model 2 (SAM2) for segmentation. YOLOv10 generates bounding box prompts to guide SAM2, enhancing segmentation accuracy. Both models were fine-tuned on herbarium images and evaluated using Intersection over Union (IoU) and Dice coefficient metrics. PlantSAM achieved state-of-the-art segmentation performance, with an IoU of 0.94 and a Dice coefficient of 0.97. Incorporating segmented images into classification models led to consistent performance improvements across five tested botanical traits, with accuracy gains of up to 4.36% and F1-score improvements of up to 4.15%. Our findings highlight the importance of background removal in herbarium image analysis, as it significantly enhances classification accuracy by allowing models to focus more effectively on the foreground plant structures.
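For reference, the two evaluation metrics reported above have standard definitions on binary masks; a minimal sketch (not the authors' evaluation code) is:

```python
import numpy as np

def iou(pred, gt):
    """Intersection over Union between two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def dice(pred, gt):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 2 * inter / total if total else 1.0
```

Note that Dice always upper-bounds IoU for the same pair of masks (Dice = 2·IoU / (1 + IoU)), which is consistent with the reported 0.94 IoU and 0.97 Dice.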