A Multimodal Benchmark Dataset and Model for Crop Disease Diagnosis

📅 2025-03-10
🏛️ European Conference on Computer Vision
🤖 AI Summary
Crop disease diagnosis faces three challenges: heterogeneous multimodal data is difficult to fuse, models are hard to interpret, and cross-domain generalization is limited. To address them, this paper introduces what it describes as the first multimodal agricultural benchmark dataset integrating images, textual disease descriptions, and a disease-specific knowledge graph, covering 12 staple crops and 86 diseases. The authors propose a Vision–Language–Knowledge Collaborative Diagnosis Framework that unifies annotation across all three modalities, injects domain knowledge via ViT-BERT and Graph Neural Networks (GNNs), aligns modalities through contrastive learning, and enables fine-grained interpretable classification. Evaluated on this benchmark, the framework achieves a mean accuracy of 92.7% and improves cross-domain generalization by 14.3% over state-of-the-art unimodal and bimodal methods. The work offers a reusable multimodal paradigm and a technical foundation for intelligent agricultural extension services.

Problem

Research questions and friction points this paper is trying to address.

Develops a multimodal dataset for crop disease diagnosis.
Enhances AI models for agricultural decision-making using image-text data.
Introduces a novel finetuning strategy for improved disease identification.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal dataset for crop disease diagnosis
Low-rank adaptation finetuning strategy
Integration of visual and textual data
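
The low-rank adaptation (LoRA) idea named above can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `LoRALinear`, the `rank` and `alpha` parameters, and the initialization values are assumptions chosen for illustration, and plain Python lists stand in for tensors. The sketch shows the general technique only: the pretrained weight stays frozen, and a trainable low-rank product `B @ A` is added to its output.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Hypothetical linear layer with a low-rank update: y = W x + (alpha / r) * B (A x)."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        self.weight = weight  # frozen pretrained weight, shape (d_out, d_in)
        d_out, d_in = len(weight), len(weight[0])
        self.scale = alpha / rank
        # A starts with small random values, B starts at zero, so the
        # adapted layer initially reproduces the frozen layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        # Only A and B would be updated during fine-tuning.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


layer = LoRALinear([[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]], rank=1)
output = layer.forward([1.0, 1.0, 1.0])
```

With rank `r`, the adapter trains `r * (d_in + d_out)` values instead of the `d_in * d_out` entries of the full weight matrix, which is why this style of fine-tuning is cheap for large layers.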
Xiang Liu
AI Innovation Center, China Unicom, Beijing 100013, China; Unicom Digital Technology, China Unicom, Beijing 100013, China
Zhaoxiang Liu
China Unicom
Computer Vision, Deep Learning, Robotics, Human-Computer Interaction
Huan Hu
PhD student, Washington State University
Analog and mixed-signal IC design
Zezhou Chen
AI Innovation Center, China Unicom, Beijing 100013, China; Unicom Digital Technology, China Unicom, Beijing 100013, China
Kohou Wang
AI Innovation Center, China Unicom, Beijing 100013, China; Unicom Digital Technology, China Unicom, Beijing 100013, China
Kai Wang
AI Innovation Center, China Unicom, Beijing 100013, China; Unicom Digital Technology, China Unicom, Beijing 100013, China
Shiguo Lian
CloudMinds