🤖 AI Summary
To address high pseudo-label noise and poor robustness to small objects and occlusion in wheat spike segmentation, this paper proposes a two-stage self-training framework based on dynamic pseudo-label refinement. The method has three components: (1) an adaptive confidence thresholding mechanism, coupled with spatial consistency verification, suppresses noise propagation; (2) a teacher–student iterative training paradigm operates under strong data augmentation, including multi-scale elastic deformation, color perturbation, and random occlusion; (3) the SegFormer-MiT-B4 architecture is adopted to enhance fine-grained feature representation. Evaluated on the Global Wheat Full Semantic Segmentation Competition, the method improves mean Intersection-over-Union (mIoU) by 4.2% over the baseline, with more robust and consistent segmentation of wheat spikes under complex field conditions such as variable illumination, dense overlap, and minute spike structures.
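As a rough illustration of confidence-based pseudo-label filtering, the sketch below keeps a teacher prediction only where its confidence clears a per-class threshold. The summary does not say how the threshold adapts, so the rule used here (the lower of a fixed base threshold and a per-class confidence quantile), the function name `refine_pseudo_labels`, the parameter defaults, and the ignore label 255 are all assumptions made for illustration. The spatial consistency check is not covered.

```python
import numpy as np

IGNORE_INDEX = 255  # assumed label for pixels excluded from the student loss


def refine_pseudo_labels(probs, base_threshold=0.9, quantile=0.5):
    """Filter teacher predictions with a per-class confidence threshold.

    probs: array of shape (C, H, W) holding per-class probabilities.
    Returns an (H, W) integer map where rejected pixels are IGNORE_INDEX.
    """
    labels = probs.argmax(axis=0)      # most likely class per pixel
    confidence = probs.max(axis=0)     # probability of that class
    refined = np.full(labels.shape, IGNORE_INDEX, dtype=np.int64)

    for c in range(probs.shape[0]):
        mask = labels == c
        if not mask.any():
            continue
        # Assumed adaptation rule: never demand more than base_threshold,
        # and relax it for classes the teacher is generally less sure about.
        class_quantile = float(np.quantile(confidence[mask], quantile))
        threshold = min(base_threshold, class_quantile)
        refined[mask & (confidence >= threshold)] = c

    return refined
```

Pixels marked with `IGNORE_INDEX` would simply be skipped by the student's loss, so low-confidence regions do not feed noise back into training.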
📝 Abstract
This extended abstract details our solution for the Global Wheat Full Semantic Segmentation Competition. We developed a systematic self-training framework that combines a two-stage hybrid training strategy with extensive data augmentation. Our core model is SegFormer with a Mix Transformer (MiT-B4) backbone. An iterative teacher-student loop progressively refines model accuracy and maximizes data utilization. Our method achieved competitive performance on both the Development and Testing Phase datasets.
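The control flow of an iterative teacher-student loop can be sketched independently of the segmentation model. The code below is a generic outline under stated assumptions, not the authors' pipeline: `self_training_loop`, `fit_midpoint`, and `predict_with_margin` are hypothetical names, and a one-dimensional threshold classifier stands in for SegFormer so the example runs without a deep learning framework.

```python
def self_training_loop(train_fn, predict_fn, labeled, unlabeled, rounds=2):
    """Generic teacher-student self-training.

    train_fn(samples) -> model, where samples is a list of (x, label).
    predict_fn(model, x) -> label, or None when the prediction is rejected.
    """
    teacher = train_fn(list(labeled))  # initial teacher: labeled data only
    for _ in range(rounds):
        pseudo = []
        for x in unlabeled:
            label = predict_fn(teacher, x)
            if label is not None:      # keep only accepted pseudo-labels
                pseudo.append((x, label))
        # The student trains on labeled plus pseudo-labeled samples,
        # then becomes the teacher for the next round.
        teacher = train_fn(list(labeled) + pseudo)
    return teacher


# Toy stand-ins (hypothetical): the "model" is a decision threshold on a line.
def fit_midpoint(samples):
    zeros = [x for x, y in samples if y == 0]
    ones = [x for x, y in samples if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def predict_with_margin(threshold, x, margin=0.2):
    if abs(x - threshold) < margin:    # too close to the boundary: reject
        return None
    return 1 if x > threshold else 0


model = self_training_loop(
    fit_midpoint, predict_with_margin,
    labeled=[(0.0, 0), (1.0, 1)], unlabeled=[0.2, 0.4, 0.9],
)
```

In this toy run the point 0.4 is rejected in every round because it sits too close to the decision boundary, which mirrors how uncertain pixels would be left out of the student's training set.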