🤖 AI Summary
Generic vision foundation models generalize poorly, depend heavily on annotations, and adapt weakly across domains when applied to fine-grained plant species identification and herbicide-induced injury assessment in agricultural field trials. To address these limitations, this work proposes an agriculture-specific vision foundation model: self-supervised pretraining on large-scale, curated agricultural imagery yields domain-optimized feature representations, which also transfer to segmentation, particularly when annotations are scarce. The model generalizes markedly better across unseen conditions, including cross-regional, multi-temporal, and UAV-captured imagery: species identification F1-score reaches 0.94 (+3 percentage points over the best generic model), herbicide injury classification F1-score reaches 0.33 (+7 percentage points), and performance under unseen conditions improves by 10 percentage points. Remarkably, using only 20% of the labeled data, it achieves a 5.4% higher F1 score than the generic model under unseen conditions, and it reaches a species F1-score of 0.60 (vs. 0.49) on UAV imagery.
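The paper ships no code; as a minimal sketch of the transfer-learning step described above, assuming a DINOv2-style self-supervised ViT as a stand-in for the domain-adapted backbone (class count, learning rate, and all names below are illustrative, not from the paper):

```python
import torch
import torch.nn as nn

# Hypothetical setup: a linear probe on top of a frozen self-supervised
# backbone. DINOv2 ViT-S/14 is used here only as a stand-in for the
# paper's agriculture-adapted foundation model (requires internet access
# on first load; inputs must be (B, 3, H, W) with H, W multiples of 14).
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False  # transfer learning: pretrained features stay fixed

num_species = 30                     # assumed number of species classes
head = nn.Linear(384, num_species)   # ViT-S/14 CLS embedding dim is 384
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One supervised step on labeled herbicide-trial image crops."""
    with torch.no_grad():
        feats = backbone(images)     # (B, 384) CLS-token features
    loss = criterion(head(feats), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Freezing the backbone keeps the comparison about representation quality rather than fine-tuning capacity, which matches how foundation models are typically benchmarked against each other.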
📝 Abstract
Herbicide field trials require accurate identification of plant species and assessment of herbicide-induced damage across diverse environments. While general-purpose vision foundation models have shown promising results in complex visual domains, their performance can be limited in agriculture, where fine-grained distinctions between species and damage types are critical. In this work, we adapt a general-purpose vision foundation model to herbicide trial characterization. Trained with a self-supervised learning approach on a large, curated agricultural dataset, the model learns rich, transferable representations optimized for herbicide trial images. Our domain-specific model significantly outperforms the best general-purpose foundation model in both species identification (F1 score improvement from 0.91 to 0.94) and damage classification (from 0.26 to 0.33). Under unseen conditions (new locations and different acquisition times), it achieves even greater gains (species identification from 0.56 to 0.66; damage classification from 0.17 to 0.27). In domain-shift scenarios, such as drone imagery, it maintains strong performance (species classification from 0.49 to 0.60). Additionally, we show that domain-specific pretraining enhances segmentation accuracy, particularly in low-annotation regimes. An annotation-efficiency analysis reveals that, under unseen conditions, the domain-specific model achieves a 5.4% higher F1 score than the general-purpose model while using 80% fewer labeled samples. These results demonstrate the generalization capabilities of domain-specific foundation models and their potential to significantly reduce manual annotation effort, offering a scalable, automated solution for herbicide trial analysis.
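As a hedged sketch of how the annotation-efficiency comparison could be set up (the helper name, stratification choice, and 20% fraction below are assumptions for illustration; the paper reports only the resulting F1 scores):

```python
import random

def subsample_labels(dataset, fraction: float, seed: int = 0) -> list[int]:
    """Keep a fraction of labeled samples, stratified per class.

    Illustrative helper for an annotation-efficiency study: train both
    the domain-specific and the generic model on e.g. 20% of the labels,
    then compare F1 on held-out, unseen-condition data.
    `dataset` is assumed to yield (image, label) pairs.
    """
    rng = random.Random(seed)
    by_class: dict[int, list[int]] = {}
    for idx, (_, label) in enumerate(dataset):
        by_class.setdefault(label, []).append(idx)
    kept = []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one sample per class so no class disappears.
        kept.extend(indices[: max(1, int(len(indices) * fraction))])
    return kept

# Usage (hypothetical): indices = subsample_labels(train_set, fraction=0.20)
```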