An Explainable Vision Transformer with Transfer Learning Combined with Support Vector Machine Based Efficient Drought Stress Identification

📅 2024-07-31
🏛️ arXiv.org
📈 Citations: 1
Influential: 0

🤖 AI Summary
Early detection of drought stress is critical for minimizing crop losses, yet the phenotypic changes are subtle, which motivates non-invasive aerial imaging and advanced modeling. This paper proposes an interpretable Vision Transformer (ViT) framework for potato crops. It applies two approaches: a ViT-SVM hybrid, in which the ViT extracts spatial features and an SVM classifies crops as stressed or healthy, and an end-to-end ViT classifier. Using transfer learning and aerial imagery, the method visualizes attention maps to highlight the spatial features the model treats as the drought stress signature. The results show high accuracy in drought stress identification, and the attention maps make the model's decisions interpretable. The framework offers a robust, interpretable basis for drought monitoring and crop management.

📝 Abstract
Early detection of drought stress is critical for taking timely measures to reduce crop loss before the drought impact becomes irreversible. The subtle phenotypical and physiological changes in response to drought stress are captured by non-invasive imaging techniques, and these imaging data serve as a valuable resource for machine learning methods to identify drought stress. While convolutional neural networks (CNNs) are in wide use, vision transformers (ViTs) present a promising alternative for capturing long-range dependencies and intricate spatial relationships, thereby enhancing the detection of subtle indicators of drought stress. We propose an explainable deep learning pipeline that leverages ViTs for drought stress detection in potato crops using aerial imagery. We applied two distinct approaches: a combination of ViT and support vector machine (SVM), where the ViT extracts intricate spatial features from aerial images and the SVM classifies the crops as stressed or healthy; and an end-to-end approach that uses a dedicated classification layer within the ViT to detect drought stress directly. We explain the ViT model's decision-making process by visualizing attention maps. These maps highlight the specific spatial features within the aerial images that the ViT model focuses on as the drought stress signature. Our findings demonstrate that the proposed methods not only achieve high accuracy in drought stress identification but also shed light on the diverse subtle plant features associated with drought stress. This offers a robust and interpretable solution for drought stress monitoring that helps farmers make informed decisions for improved crop management.
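
The first approach in the abstract, a ViT as feature extractor followed by an SVM classifier, can be sketched roughly as below. This is a minimal illustration and not the authors' implementation: the embeddings are synthetic placeholders standing in for features a pretrained ViT would produce, and the 768-dimensional size, the linear kernel, the value of `C`, and the helper names `make_placeholder_features` and `train_svm_head` are all assumptions introduced here.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_placeholder_features(n_per_class=100, dim=768, seed=0):
    """Return synthetic stand-ins for ViT image embeddings.

    Hypothetical data: in a real pipeline each row would be the embedding
    a pretrained ViT produces for one aerial image patch.
    """
    rng = np.random.default_rng(seed)
    healthy = rng.normal(0.0, 1.0, size=(n_per_class, dim))
    stressed = rng.normal(0.5, 1.0, size=(n_per_class, dim))
    X = np.vstack([healthy, stressed])
    y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = healthy, 1 = stressed
    return X, y


def train_svm_head(X, y, seed=0):
    """Fit an SVM on fixed features and report held-out accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Kernel and C are assumptions; the abstract does not state them.
    clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
    clf.fit(X_train, y_train)
    return clf, clf.score(X_test, y_test)


X, y = make_placeholder_features()
clf, acc = train_svm_head(X, y)
print(f"held-out accuracy on placeholder features: {acc:.3f}")
```

Because the features here are synthetic and well separated, the reported accuracy says nothing about real drought data; the sketch only shows how a frozen feature extractor and a separate SVM head fit together.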
Problem

Research questions and friction points this paper is trying to address.

Early detection of drought stress in crops
Using vision transformers for drought stress identification
Explainable deep learning for agricultural monitoring
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision Transformer with SVM for drought detection
Explainable deep learning with attention maps
Aerial imagery analysis for crop stress
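
As a rough illustration of how attention maps can be turned into a spatial heatmap, the sketch below uses attention rollout, a commonly used technique for ViTs. The source does not say which visualization method the authors used, so rollout is an assumption made here, as are the random attention weights, the 12-layer, 3-head, 14 x 14 patch configuration, and the function names.

```python
import numpy as np


def attention_rollout(attentions):
    """Combine per-layer attention into one token-to-token map.

    attentions: list of arrays shaped (heads, tokens, tokens), one per
    layer, where each row sums to 1.
    """
    n_tokens = attentions[0].shape[-1]
    rollout = np.eye(n_tokens)
    for layer_attn in attentions:
        attn = layer_attn.mean(axis=0)            # average over heads
        attn = attn + np.eye(n_tokens)            # account for the residual path
        attn = attn / attn.sum(axis=-1, keepdims=True)
        rollout = attn @ rollout                  # propagate through the layer
    return rollout


def cls_heatmap(rollout, grid_size):
    """Reshape the [CLS] row (token 0) into a patch-grid heatmap scaled to max 1."""
    cls_to_patches = rollout[0, 1:]
    heatmap = cls_to_patches.reshape(grid_size, grid_size)
    return heatmap / heatmap.max()


# Hypothetical attention weights: softmax over random scores.
rng = np.random.default_rng(0)
grid_size, n_layers, n_heads = 14, 12, 3
n_tokens = 1 + grid_size * grid_size              # [CLS] token plus patches
scores = rng.normal(size=(n_layers, n_heads, n_tokens, n_tokens))
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)

rollout = attention_rollout(list(weights))
heatmap = cls_heatmap(rollout, grid_size)
```

With a real model, `weights` would come from the attention layers of the fine-tuned ViT, and `heatmap` would be upsampled and overlaid on the aerial image to show which regions drive the prediction.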