🤖 AI Summary
Existing self-supervised methods struggle to model the multi-scale, hierarchical lesion patterns in plant disease images. To address this, we propose PSMamba, a progressive self-supervised framework. Its core innovation is a “shared global teacher with dual specialized students” architecture: one student models mid-scale features (e.g., lesion distribution and vein structure), while the other captures local-scale features (e.g., textural anomalies and early-stage lesions). PSMamba integrates Vision Mamba’s efficient sequence modeling, hierarchical knowledge distillation to the two students, multi-view contrastive learning, and a cross-scale consistency loss. Evaluated on three benchmark datasets, PSMamba consistently outperforms state-of-the-art self-supervised methods, achieving higher accuracy and robustness under domain shift and in fine-grained disease classification tasks.
📝 Abstract
Self-supervised learning (SSL) has become a powerful paradigm for representation learning without manual annotations. However, most existing frameworks focus on global alignment and struggle to capture the hierarchical, multi-scale lesion patterns characteristic of plant disease imagery. To address this gap, we propose PSMamba, a progressive self-supervised framework that integrates the efficient sequence modelling of Vision Mamba (VM) with a dual-student hierarchical distillation strategy. Unlike conventional designs with a single teacher-student pair, PSMamba employs a shared global teacher and two specialised students: one processes mid-scale views to capture lesion distributions and vein structures, while the other focuses on local views to capture fine-grained cues such as texture irregularities and early-stage lesions. This multi-granular supervision facilitates the joint learning of contextual and detailed representations, with consistency losses ensuring coherent cross-scale alignment. Experiments on three benchmark datasets show that PSMamba consistently outperforms state-of-the-art SSL methods, delivering superior accuracy and robustness in both domain-shifted and fine-grained scenarios.
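The abstract does not give the loss formulation, so the following is only a minimal sketch of how a shared-teacher, dual-student objective with a cross-scale consistency term could be assembled. Everything in it is an assumption made for illustration: the DINO-style cross-entropy between softened distributions, the temperatures, the squared-difference consistency term, the weight `w_consistency`, and all function names. Random arrays stand in for the outputs of the teacher and student networks, which in PSMamba would be Vision Mamba encoders.

```python
import numpy as np

def softmax(logits, temperature):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(teacher_logits, student_logits, t_teacher=0.04, t_student=0.1):
    """Cross-entropy from the teacher distribution to a student distribution.
    ASSUMPTION: a DINO-style loss; the paper's actual loss may differ."""
    p_teacher = softmax(teacher_logits, t_teacher)
    log_p_student = np.log(softmax(student_logits, t_student) + 1e-12)
    return float(-(p_teacher * log_p_student).sum(axis=-1).mean())

def consistency_loss(mid_logits, local_logits, temperature=0.1):
    """Squared difference between the two students' distributions.
    ASSUMPTION: one possible form of a cross-scale consistency term."""
    p_mid = softmax(mid_logits, temperature)
    p_local = softmax(local_logits, temperature)
    return float(((p_mid - p_local) ** 2).sum(axis=-1).mean())

def total_loss(teacher_global, student_mid, student_local, w_consistency=0.5):
    """Both students learn from the shared global teacher; the consistency
    term additionally pulls the two students towards each other.
    ASSUMPTION: w_consistency is a hypothetical weighting."""
    return (distill_loss(teacher_global, student_mid)
            + distill_loss(teacher_global, student_local)
            + w_consistency * consistency_loss(student_mid, student_local))

# Placeholder outputs: a batch of 4 images, 16-dimensional projection heads.
rng = np.random.default_rng(0)
teacher_global = rng.normal(size=(4, 16))  # teacher output on the global view
student_mid = rng.normal(size=(4, 16))     # student 1 output on a mid-scale view
student_local = rng.normal(size=(4, 16))   # student 2 output on a local view

loss = total_loss(teacher_global, student_mid, student_local)
print(f"total loss: {loss:.4f}")
```

In a real training loop the teacher would typically receive no gradient and be updated separately (for example as a moving average of student weights), but the abstract does not state how PSMamba updates its teacher, so that step is left out here.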