Self-Consistency in Vision-Language Models for Precision Agriculture: Multi-Response Consensus for Crop Disease Management

📅 2025-07-08
🤖 AI Summary
Vision-language models (VLMs) often underperform on specialized agricultural tasks such as plant disease identification and treatment recommendation. This paper proposes a compact framework that combines prompt engineering with a self-consistency mechanism. It uses a prompt that configures a language model as an expert plant pathologist to assess model outputs, and a cosine-similarity self-voting step that generates multiple candidate responses and selects the most semantically coherent one using domain-adapted embeddings. Applied to a fine-tuned PaliGemma model for maize leaf disease identification from field images, the approach raises diagnostic accuracy from 82.2% to 87.8%, symptom analysis from 38.9% to 52.2%, and treatment recommendation from 27.8% to 43.3% relative to standard greedy decoding. The system remains compact enough for deployment on mobile devices, supporting real-time decision-making in resource-constrained field settings.

📝 Abstract
Precision agriculture relies heavily on accurate image analysis for crop disease identification and treatment recommendation, yet existing vision-language models (VLMs) often underperform in specialized agricultural domains. This work presents a domain-aware framework for agricultural image processing that combines prompt-based expert evaluation with self-consistency mechanisms to enhance VLM reliability in precision agriculture applications. We introduce two key innovations: (1) a prompt-based evaluation protocol that configures a language model as an expert plant pathologist for scalable assessment of image analysis outputs, and (2) a cosine-consistency self-voting mechanism that generates multiple candidate responses from agricultural images and selects the most semantically coherent diagnosis using domain-adapted embeddings. Applied to maize leaf disease identification from field images using a fine-tuned PaliGemma model, our approach improves diagnostic accuracy from 82.2% to 87.8%, symptom analysis from 38.9% to 52.2%, and treatment recommendation from 27.8% to 43.3% compared to standard greedy decoding. The system remains compact enough for deployment on mobile devices, supporting real-time agricultural decision-making in resource-constrained environments. These results demonstrate significant potential for AI-driven precision agriculture tools that can operate reliably in diverse field conditions.
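The cosine-consistency self-voting step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for the domain-adapted embeddings, the candidate strings are made up, and scoring each candidate by its mean cosine similarity to the others is an assumed reading of "most semantically coherent". The names `embed`, `cosine`, and `select_consensus` are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumption: stands in for domain-adapted embeddings)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_consensus(candidates, embed_fn=embed):
    """Return the index of the candidate most similar, on average, to the others."""
    if not candidates:
        raise ValueError("need at least one candidate response")
    if len(candidates) == 1:
        return 0
    vectors = [embed_fn(c) for c in candidates]
    scores = []
    for i, vi in enumerate(vectors):
        # Mean similarity to every other candidate (assumed voting rule).
        others = [cosine(vi, vj) for j, vj in enumerate(vectors) if j != i]
        scores.append(sum(others) / len(others))
    return max(range(len(candidates)), key=scores.__getitem__)


# Hypothetical responses sampled from a VLM for one leaf image.
candidates = [
    "northern leaf blight with long gray lesions",
    "northern leaf blight with elongated gray lesions",
    "common rust with orange pustules",
]
best = select_consensus(candidates)
print(candidates[best])
```

In this toy run the two blight descriptions reinforce each other, so the outlier rust diagnosis is not selected. In practice `embed_fn` would be replaced by a sentence-embedding model, and the candidates would come from sampled (non-greedy) decoding.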
Problem

Research questions and friction points this paper is trying to address.

Improving crop disease identification accuracy in precision agriculture
Enhancing vision-language model reliability for agricultural applications
Enabling real-time decision-making for crop disease management
Innovation

Methods, ideas, or system contributions that make the work stand out.

Prompt-based expert evaluation for scalable assessment
Cosine-consistency self-voting for coherent diagnosis
Fine-tuned PaliGemma model for mobile deployment
Mihir Gupta
The Harker School, USA
Abhay Mangla
Dougherty Valley High School, USA
Pratik Desai
Kissan.ai, USA
Ross Greer
University of California Merced
Artificial Intelligence, Machine Vision, Autonomous Driving, Human-Robot Interaction, Computer Music