AI Summary
Freshness prediction for fruits and vegetables faces dual challenges of scarce expert annotations and high data privacy sensitivity. Method: We propose a model-agnostic ordinal meta-learning framework that tightly integrates vision-language models (VLMs), ordinal regression, and meta-learning, augmented by a knowledge distillation strategy from proprietary VLMs. Our approach requires no access to raw sensitive data and achieves fine-grained quality assessment using only a minimal number of ordinal-labeled samples (e.g., "fresh → slightly spoiled → severely spoiled"). Contribution/Results: It is the first work to embed ordinal structural priors into the meta-learning paradigm, jointly preserving label semantic order and enabling cross-task generalization. Under zero-shot and few-shot settings, our method achieves a mean accuracy of 92.71%, significantly outperforming existing open-source VLM-based approaches. This provides a scalable, privacy-preserving paradigm for agricultural visual perception under low-data and high-privacy constraints.
Abstract
To reduce the wastage of perishable fruits, it is crucial to accurately predict their freshness or shelf life using non-invasive methods that rely on visual data. Deep learning techniques offer a viable solution. However, obtaining fine-grained fruit freshness labels from experts is costly, which leads to a scarcity of data. Closed, proprietary Vision Language Models (VLMs), such as Gemini, have demonstrated strong performance on the fruit freshness detection task in both zero-shot and few-shot settings. Nonetheless, food retail organizations cannot use these proprietary models because of data privacy concerns, while existing open-source VLMs yield sub-optimal performance on the task. Fine-tuning these open-source models with limited data fails to reach the performance of proprietary models. In this work, we introduce a Model-Agnostic Ordinal Meta-Learning (MAOML) algorithm designed to train smaller VLMs. The approach uses meta-learning to address data sparsity and leverages label ordinality, thereby achieving state-of-the-art performance on the fruit freshness classification task in both zero-shot and few-shot settings. Our method achieves an industry-standard accuracy of 92.71%, averaged across all fruits.
Keywords: Fruit Quality Prediction, Vision Language Models, Meta Learning, Ordinal Regression
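To make the notion of label ordinality concrete, the following is a minimal sketch of one common way to encode ordered labels, the cumulative-threshold encoding. It is not taken from the paper and may differ from the formulation MAOML actually uses; the three-level scale and the names `LEVELS`, `encode_ordinal`, `decode_ordinal`, and `ordinal_loss` are assumptions made for illustration.

```python
import math

# Hypothetical three-level freshness scale, ordered from best to worst.
LEVELS = ["fresh", "slightly spoiled", "severely spoiled"]


def encode_ordinal(label, levels=LEVELS):
    """Return K-1 cumulative binary targets: target[k] is 1 if rank(label) > k."""
    rank = levels.index(label)
    return [1 if rank > k else 0 for k in range(len(levels) - 1)]


def decode_ordinal(probs, levels=LEVELS, threshold=0.5):
    """Map K-1 threshold probabilities to a label by counting exceeded thresholds."""
    rank = sum(1 for p in probs if p > threshold)
    return levels[rank]


def ordinal_loss(probs, label, levels=LEVELS):
    """Sum of binary cross-entropies over the K-1 thresholds."""
    eps = 1e-12  # guards against log(0)
    targets = encode_ordinal(label, levels)
    return -sum(
        t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
        for p, t in zip(probs, targets)
    )


if __name__ == "__main__":
    for level in LEVELS:
        print(level, encode_ordinal(level))
    # A prediction that is confident about "fresh" is penalized more when
    # the true label is two steps away than when it is one step away.
    confident_fresh = [0.1, 0.1]
    print(ordinal_loss(confident_fresh, "slightly spoiled"))
    print(ordinal_loss(confident_fresh, "severely spoiled"))
```

Unlike plain cross-entropy over unordered classes, this loss grows with the distance between the predicted and true levels, which is the property an ordinal objective is meant to capture.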