Self-Consistency in Vision-Language Models for Precision Agriculture: Multi-Response Consensus for Crop Disease Management

📅 2025-07-08

🤖 AI Summary

To address the limited accuracy and weak domain adaptation of vision-language models (VLMs) on plant disease identification and treatment recommendation in precision agriculture, this paper proposes a compact framework that combines prompt engineering with a self-consistency mechanism. It configures a language model as an expert plant pathologist to assess model outputs, and it adds a cosine-similarity self-voting step that generates multiple candidate responses per image and selects the most semantically coherent one using domain-adapted embeddings. Applied to maize leaf disease identification from field images with a fine-tuned PaliGemma model, the approach raises diagnostic accuracy from 82.2% to 87.8%, symptom analysis from 38.9% to 52.2%, and treatment recommendation from 27.8% to 43.3% relative to standard greedy decoding. The system remains compact enough for deployment on mobile devices, supporting real-time decision-making in resource-constrained field settings.

📝 Abstract

Precision agriculture relies heavily on accurate image analysis for crop disease identification and treatment recommendation, yet existing vision-language models (VLMs) often underperform in specialized agricultural domains. This work presents a domain-aware framework for agricultural image processing that combines prompt-based expert evaluation with self-consistency mechanisms to enhance VLM reliability in precision agriculture applications. We introduce two key innovations: (1) a prompt-based evaluation protocol that configures a language model as an expert plant pathologist for scalable assessment of image analysis outputs, and (2) a cosine-consistency self-voting mechanism that generates multiple candidate responses from agricultural images and selects the most semantically coherent diagnosis using domain-adapted embeddings. Applied to maize leaf disease identification from field images using a fine-tuned PaliGemma model, our approach improves diagnostic accuracy from 82.2% to 87.8%, symptom analysis from 38.9% to 52.2%, and treatment recommendation from 27.8% to 43.3% compared to standard greedy decoding. The system remains compact enough for deployment on mobile devices, supporting real-time agricultural decision-making in resource-constrained environments. These results demonstrate significant potential for AI-driven precision agriculture tools that can operate reliably in diverse field conditions.
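The cosine-consistency self-voting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`embed`, `cosine`, `self_vote`) are hypothetical, the bag-of-words `embed` is a toy stand-in for the domain-adapted embeddings the abstract mentions, and the selection rule (pick the candidate with the highest mean cosine similarity to the other candidates) is one plausible reading of "most semantically coherent".

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in embedding: bag-of-words counts.

    Assumption: a real system would use domain-adapted sentence embeddings here.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def self_vote(candidates, embed_fn=embed):
    """Return the index of the candidate most similar, on average, to the others."""
    if not candidates:
        raise ValueError("need at least one candidate response")
    if len(candidates) == 1:
        return 0
    vectors = [embed_fn(c) for c in candidates]
    scores = []
    for i, v in enumerate(vectors):
        sims = [cosine(v, w) for j, w in enumerate(vectors) if j != i]
        scores.append(sum(sims) / len(sims))
    # Ties resolve to the earliest candidate.
    return max(range(len(candidates)), key=scores.__getitem__)


# Made-up candidate responses, as if sampled from a VLM for one leaf image.
candidates = [
    "gray leaf spot with rectangular lesions",
    "gray leaf spot with tan rectangular lesions",
    "common rust with orange pustules",
]
best = candidates[self_vote(candidates)]
```

In this toy example the two responses that agree on the diagnosis reinforce each other, so one of them is selected over the outlier. Passing a different `embed_fn` is the hook where a domain-adapted embedding model would plug in.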

Problem

Research questions and friction points this paper is trying to address.

Improving crop disease identification accuracy in precision agriculture

Enhancing vision-language model reliability for agricultural applications

Enabling real-time decision-making for crop disease management

Innovation

Methods, ideas, or system contributions that make the work stand out.

Prompt-based expert evaluation for scalable assessment

Cosine-consistency self-voting for coherent diagnosis

Fine-tuned PaliGemma model for mobile deployment
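
The prompt-based expert evaluation listed above can be sketched roughly as follows. Everything in this block is an assumption made for illustration: the prompt wording, the one-word verdict format, and the helper names `build_judge_prompt` and `parse_verdict` are hypothetical and are not taken from the paper. The call to an actual language model is deliberately left out.

```python
def build_judge_prompt(reference, candidate):
    """Build a hypothetical expert-role prompt for grading one model output."""
    return (
        "You are an expert plant pathologist.\n"
        "Compare the candidate answer with the reference answer for a maize "
        "leaf image.\n"
        f"Reference: {reference}\n"
        f"Candidate: {candidate}\n"
        "Answer with exactly one word: CORRECT or INCORRECT."
    )


def parse_verdict(reply):
    """Map the judge's reply to True, False, or None if it cannot be parsed."""
    words = reply.strip().upper().split()
    if not words:
        return None
    first = words[0].strip(".,:;!")
    if first == "CORRECT":
        return True
    if first == "INCORRECT":
        return False
    return None


prompt = build_judge_prompt(
    reference="gray leaf spot",
    candidate="gray leaf spot with rectangular lesions",
)
```

Constraining the judge to a fixed verdict vocabulary keeps the parsing trivial, which is what makes this kind of assessment scale to many outputs. Returning `None` for unparseable replies lets the caller retry or flag the item rather than silently counting it as wrong.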
