AI Summary
To address the urgent need for early detection of rice leaf diseases in Bangladesh, this study systematically compares the fine-grained classification performance of ResNet50, Vision Transformer (ViT), and SVM on the locally curated, small-scale Dhan-Shomadhan dataset, marking the first such benchmark. We evaluate the efficacy of transfer learning and data augmentation for agricultural image classification under limited-data conditions. Experimental results show ResNet50 achieves 98.2% accuracy, substantially outperforming ViT (95.7%) and SVM (89.3%), while offering a favorable trade-off between computational efficiency and accuracy for resource-constrained deployment. Key contributions include: (1) establishing the first standardized benchmark framework for rice disease recognition in Bangladesh; (2) empirically demonstrating the superiority of transfer learning in low-resource agricultural vision tasks; and (3) delivering a robust, end-to-end, deployable disease identification solution tailored for developing countries.
Abstract
In nations such as Bangladesh, agriculture provides livelihoods for a significant portion of the population. Identifying and classifying plant diseases early is critical to preventing their spread and minimizing their impact on crop yield and quality. Various computer vision techniques can be used for such detection and classification. While CNNs have been dominant on such image classification tasks, vision transformers have recently become equally competitive. In this paper, we study various computer vision techniques for Bangladeshi rice leaf disease detection. We use Dhan-Shomadhan, a Bangladeshi rice leaf disease dataset, to experiment with various CNN and ViT models. We also compare the performance of these deep neural network architectures with a traditional machine learning method, the Support Vector Machine (SVM). We leverage transfer learning for better generalization with a smaller amount of training data. Among the models tested, ResNet50 performed best, ahead of the other CNN and transformer-based models, making it the optimal choice for this task.