Zero-shot Hierarchical Plant Segmentation via Foundation Segmentation Models and Text-to-image Attention

📅 2025-09-10
🤖 AI Summary
Problem: Zero-shot hierarchical segmentation of rosette plants with overlapping leaves in top-view imagery remains challenging: existing foundation models can extract individual leaves but cannot group them into whole plants without training. Method: ZeroPlantSeg couples a foundation segmentation model (for leaf-level instance extraction) with a vision-language model whose text-to-image attention is used to reason about plant structure. The pipeline is fully zero-shot and requires neither annotations nor fine-tuning. Contribution/Results: The method uses cross-modal semantic priors to associate each leaf with the plant it belongs to, which lets it generalize across species, growth stages, and imaging conditions. In experiments covering these variations, ZeroPlantSeg outperforms existing zero-shot methods and achieves better cross-domain performance than supervised baselines.

📝 Abstract
Foundation segmentation models achieve reasonable leaf instance extraction from top-view crop images without training (i.e., zero-shot). However, segmenting entire plant individuals, each consisting of multiple overlapping leaves, remains challenging. This problem is referred to as a hierarchical segmentation task and typically requires annotated training datasets, which are often species-specific and demand considerable human labor. To address this, we introduce ZeroPlantSeg, a zero-shot segmentation method for rosette-shaped plant individuals in top-view images. We integrate a foundation segmentation model, which extracts leaf instances, with a vision-language model, which reasons about plant structure, to extract plant individuals without additional training. Evaluations on datasets with multiple plant species, growth stages, and shooting environments demonstrate that our method surpasses existing zero-shot methods and achieves better cross-domain performance than supervised methods. Implementations are available at https://github.com/JunhaoXing/ZeroPlantSeg.

Problem

Research questions and friction points this paper is trying to address.

Segmenting entire plant individuals with overlapping leaves
Eliminating need for annotated species-specific training datasets
Achieving zero-shot hierarchical segmentation across diverse conditions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Foundation segmentation model for leaf extraction
Vision-language model for plant structure reasoning
Zero-shot hierarchical segmentation without training
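
To make the leaf-to-plant grouping idea concrete, here is a minimal sketch of one way leaf instances could be merged into plant individuals. It is not the authors' implementation. It assumes that the leaf masks (from a foundation segmentation model) and one attention map per plant (derived from a vision-language model) have already been computed, and it uses a simple rule chosen for illustration: each leaf is assigned to the plant whose map has the highest mean value inside that leaf. The function name `group_leaves_into_plants` and both input layouts are hypothetical.

```python
import numpy as np


def group_leaves_into_plants(leaf_masks, plant_attention):
    """Assign each leaf mask to one plant and merge leaves per plant.

    Illustrative assumption, not the paper's algorithm: a leaf belongs to
    the plant whose attention map has the highest mean value inside it.

    leaf_masks:      bool array, shape (n_leaves, H, W)
    plant_attention: float array, shape (n_plants, H, W)
    Returns (labels, plant_masks): labels has shape (n_leaves,),
    plant_masks is a bool array of shape (n_plants, H, W).
    """
    leaf_masks = np.asarray(leaf_masks, dtype=bool)
    plant_attention = np.asarray(plant_attention, dtype=float)
    n_leaves = leaf_masks.shape[0]
    n_plants = plant_attention.shape[0]

    # Flatten the spatial dimensions so scoring becomes a matrix product.
    flat_leaves = leaf_masks.reshape(n_leaves, -1).astype(float)
    flat_attention = plant_attention.reshape(n_plants, -1)

    # scores[i, j] = mean attention of plant j over the pixels of leaf i.
    areas = flat_leaves.sum(axis=1)
    scores = (flat_leaves @ flat_attention.T) / np.maximum(areas, 1.0)[:, None]

    # Each leaf goes to its best-scoring plant (empty leaves default to 0).
    labels = scores.argmax(axis=1)

    # A plant mask is the union of the leaves assigned to that plant.
    plant_masks = np.zeros(plant_attention.shape, dtype=bool)
    for leaf, label in zip(leaf_masks, labels):
        plant_masks[label] |= leaf
    return labels, plant_masks
```

The hard assignment keeps the sketch short. A real pipeline would likely also need to handle leaves with weak or ambiguous attention, which this rule ignores.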