🤖 AI Summary
This work targets three challenges in self-supervised learning for plant disease detection: modeling continuous disease progression along leaf veins, capturing long-range spatial dependencies, and containing the computational cost of high-resolution inputs. It integrates the linear-complexity Vision Mamba state-space model into self-supervised learning on agricultural images. The proposed framework combines a Mamba-based encoder, prototype-driven teacher-student contrastive learning, and multi-view feature alignment to explicitly model the spatially oriented continuity of lesions. Evaluated on three public plant disease datasets, the method consistently outperforms CNN- and ViT-based baselines. Qualitative analysis shows that the learned representations are compact and activate strongly on pathological regions, improving both the representation quality of unlabeled leaf images and downstream detection performance.
📝 Abstract
Self-supervised learning (SSL) is attractive for plant disease detection because it can exploit large collections of unlabelled leaf images, yet most existing SSL methods are built on CNNs or vision transformers that are poorly matched to agricultural imagery. CNN-based SSL struggles to capture disease patterns that evolve continuously along leaf structures, while transformer-based SSL incurs an attention cost that grows quadratically with the number of patches, which becomes expensive at high resolution. To address these limitations, we propose StateSpace-SSL, a linear-time SSL framework that employs a Vision Mamba state-space encoder to model long-range lesion continuity through directional scanning across the leaf surface. A prototype-driven teacher-student objective aligns representations across multiple views, encouraging stable and lesion-aware features from unlabelled data. Experiments on three publicly available plant disease datasets show that StateSpace-SSL consistently outperforms CNN- and transformer-based SSL baselines across several evaluation metrics. Qualitative analyses further confirm that it learns compact, lesion-focused feature maps, highlighting the advantage of linear state-space modelling for self-supervised plant disease representation learning.
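To make the "linear-time directional scanning" idea concrete, the following is a minimal sketch of a diagonal state-space recurrence run forward and backward over a grid of patch features. It is not the StateSpace-SSL encoder or the Vision Mamba implementation: the function names (`ssm_scan`, `bidirectional_scan`), the fixed per-channel parameters `a`, `b`, `c`, the row-major scan order, and the 14×14×8 grid are all assumptions chosen for illustration, and real Mamba layers use input-dependent (selective) parameters and a hardware-efficient scan. The point is only that each patch triggers one state update, so cost grows linearly with the number of patches, whereas self-attention compares every patch with every other.

```python
import numpy as np

def ssm_scan(x, a, b, c):
    """Run a diagonal linear state-space recurrence over a sequence.

    x: (T, D) patch features in scan order.
    a, b, c: (D,) per-channel decay, input, and output weights (hypothetical,
             fixed here; Mamba makes them depend on the input).
    Returns y: (T, D). Cost is O(T * D): one state update per patch.
    """
    h = np.zeros(x.shape[1])
    y = np.empty_like(x)
    for t in range(x.shape[0]):
        h = a * h + b * x[t]  # state summarises all patches seen so far
        y[t] = c * h
    return y

def bidirectional_scan(patches, a, b, c):
    """Scan an (H, W, D) patch grid row-major in both directions and sum."""
    H, W, D = patches.shape
    seq = patches.reshape(H * W, D)
    fwd = ssm_scan(seq, a, b, c)
    # Reverse the sequence, scan, then restore the original patch order.
    bwd = ssm_scan(seq[::-1], a, b, c)[::-1]
    return (fwd + bwd).reshape(H, W, D)

# Toy input: a 14x14 grid of 8-dimensional patch features (assumed sizes).
rng = np.random.default_rng(0)
patches = rng.standard_normal((14, 14, 8))
a = np.full(8, 0.9)  # decay close to 1 keeps long-range context
b = np.ones(8)
c = np.ones(8)

out = bidirectional_scan(patches, a, b, c)
print(out.shape)  # (14, 14, 8)
```

With a decay `a` close to 1, the state retains information from distant patches, which is the property the abstract relies on for following a lesion along a leaf; scanning in both directions lets each patch see context on either side of it in the scan order.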