🤖 AI Summary
This paper addresses two obstacles to weed identification in agricultural fields: “shadow bias,” in which models misidentify shadows as vegetation, and the scarcity of labeled data. It proposes a diagnosis-driven semi-supervised learning framework. Supervised baselines for classification (ResNet) and detection (YOLO, RF-DETR) are first trained on a custom dataset of Guinea Grass in sugarcane, reaching F1 scores up to 0.90 and mAP₅₀ above 0.82; interpretability analysis of these baselines exposes the shadow-induced misclassifications. A pseudo-labeling pipeline then leverages unlabeled images to expose the models to more diverse visual conditions, which helps mitigate shadow bias and improves recall. The method is further validated in a low-data regime on a public crop-weed benchmark, supporting its practicality for precision spraying systems where annotation budgets are limited.
📝 Abstract
The automated management of invasive weeds is critical for sustainable agriculture, yet the performance of deep learning models in real-world fields is often compromised by two factors: challenging environmental conditions and the high cost of data annotation. This study tackles both issues through a diagnostic-driven, semi-supervised framework. Using a unique dataset of approximately 975 labeled and 10,000 unlabeled images of Guinea Grass in sugarcane, we first establish strong supervised baselines for classification (ResNet) and detection (YOLO, RF-DETR), achieving F1 scores up to 0.90 and mAP50 scores exceeding 0.82. Crucially, this foundational analysis, aided by interpretability tools, uncovered a pervasive "shadow bias," where models learned to misidentify shadows as vegetation. This diagnostic insight motivated our primary contribution: a semi-supervised pipeline that leverages unlabeled data to enhance model robustness. By training models on a more diverse set of visual information through pseudo-labeling, this framework not only helps mitigate the shadow bias but also provides a tangible boost in recall, a critical metric for minimizing weed escapes in automated spraying systems. To validate our methodology, we demonstrate its effectiveness in a low-data regime on a public crop-weed benchmark. Our work provides a clear and field-tested framework for developing, diagnosing, and improving robust computer vision systems for the complex realities of precision agriculture.
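The abstract does not state how pseudo-labels are selected, so the following is only a minimal sketch of one common choice: keeping unlabeled samples whose top predicted class probability clears a confidence threshold. The function name `select_pseudo_labels`, the threshold of 0.9, the class indices, and the example probabilities are all assumptions made for illustration, not details from the paper.

```python
import numpy as np


def select_pseudo_labels(probs, threshold=0.9):
    """Pick confident predictions on unlabeled data (hypothetical helper).

    probs: array of shape (n_samples, n_classes) holding class probabilities
           predicted by a model trained on the labeled set.
    Returns the indices of the samples kept and their pseudo-labels.
    """
    probs = np.asarray(probs, dtype=float)
    confidence = probs.max(axis=1)      # top class probability per sample
    labels = probs.argmax(axis=1)       # predicted class per sample
    keep = np.flatnonzero(confidence >= threshold)
    return keep, labels[keep]


# Made-up probabilities for four unlabeled images.
# Assumed class indices: 0 = crop, 1 = weed.
probs = np.array([
    [0.97, 0.03],
    [0.55, 0.45],
    [0.08, 0.92],
    [0.40, 0.60],
])
idx, pseudo = select_pseudo_labels(probs, threshold=0.9)
```

In a pipeline of this kind, the selected samples would typically be merged with the labeled set and the model retrained. A stricter threshold admits fewer but cleaner pseudo-labels, while a looser one adds more data at the cost of more label noise.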