Foundation Model-Based Apple Ripeness and Size Estimation for Selective Harvesting

📅 2025-02-03
🤖 AI Summary
Current apple harvesting robots cannot reliably estimate fruit ripeness and size, so they harvest non-selectively. To address this, we propose an RGB-D framework that estimates both properties jointly for selective harvesting. Our method uses Grounding-DINO for apple detection and binary ripeness classification (ripe vs. unripe), and evaluates six size estimation algorithms within a single assessment pipeline. The accompanying Fuji apple dataset carries ripeness labels derived from fruit color and image capture dates, together with size labels, and comprises 4,027 images and 16,257 annotated instances. In the reported experiments, Grounding-DINO outperforms other state-of-the-art models on detection and ripeness classification, and the selected size estimation algorithm has the lowest error and variation among the six candidates. The dataset and the detection and size estimation algorithms are publicly available.
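To make the RGB-D size estimation idea concrete, here is a minimal sketch of one common approach: converting a detection box to metric size with the pinhole camera model. It is not one of the paper's six algorithms, which this summary does not describe. The function name `estimate_diameter_mm`, the box format, and the use of the median depth inside the box are all assumptions made for illustration.

```python
import numpy as np

def estimate_diameter_mm(box_xyxy, depth_m, fx, fy):
    """Approximate fruit diameter (mm) from a bounding box and a depth map.

    Hypothetical helper, not the paper's method.
    box_xyxy: (x1, y1, x2, y2) in pixels.
    depth_m:  2D array of depth values in metres, aligned with the RGB image.
    fx, fy:   camera focal lengths in pixels.
    """
    x1, y1, x2, y2 = [int(round(v)) for v in box_xyxy]
    patch = depth_m[y1:y2, x1:x2]

    # Ignore missing depth readings (zeros, NaN, inf), which are common in RGB-D data.
    valid = patch[np.isfinite(patch) & (patch > 0)]
    if valid.size == 0:
        return None

    # Assumption: the median depth inside the box approximates the fruit distance.
    z = float(np.median(valid))

    # Pinhole model: metric extent = pixel extent * depth / focal length.
    width_m = (x2 - x1) * z / fx
    height_m = (y2 - y1) * z / fy

    # Average the two extents and convert metres to millimetres.
    return 1000.0 * 0.5 * (width_m + height_m)
```

A box can include background pixels and occluding leaves, so the median depth is only a rough proxy for the distance to the fruit; a segmentation mask, where available, would give a cleaner estimate.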

📝 Abstract
Harvesting is a critical task in the tree fruit industry, demanding extensive manual labor and substantial costs, and exposing workers to potential hazards. Recent advances in automated harvesting offer a promising solution by enabling efficient, cost-effective, and ergonomic fruit picking within tight harvesting windows. However, existing harvesting technologies often indiscriminately harvest all visible and accessible fruits, including those that are unripe or undersized. This study introduces a novel foundation model-based framework for efficient apple ripeness and size estimation. Specifically, we curated two public RGB-D Fuji apple image datasets and added expanded annotations for ripeness ("Ripe" vs. "Unripe") based on fruit color and image capture dates. The resulting Fuji-Ripeness-Size Dataset includes 4,027 images and 16,257 annotated apples with ripeness and size labels. Using Grounding-DINO, a language-model-based object detector, we achieved robust apple detection and ripeness classification, outperforming other state-of-the-art models. Additionally, we developed and evaluated six size estimation algorithms and selected the one with the lowest error and variation. The Fuji-Ripeness-Size Dataset and the apple detection and size estimation algorithms are made publicly available, providing benchmarks for future studies in automated and selective harvesting.
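The selection step described above, choosing the size estimator with the lowest error and variation, can be sketched as follows. The abstract does not say which error metric is used or how error and variation are combined, so this sketch assumes mean absolute error as the primary criterion and its standard deviation as a tie-breaker. The function names and the dictionary layout are hypothetical.

```python
import numpy as np

def summarize_errors(pred_mm, true_mm):
    """Return (mean absolute error, standard deviation of absolute error) in mm."""
    err = np.abs(np.asarray(pred_mm, dtype=float) - np.asarray(true_mm, dtype=float))
    return float(err.mean()), float(err.std())

def pick_best(predictions, true_mm):
    """Pick the estimator with the lowest mean error, breaking ties by lower spread.

    predictions: dict mapping an estimator name to its predicted sizes (mm).
    true_mm:     ground-truth sizes (mm), in the same order as the predictions.
    Assumption: ranking by (mean, std) is illustrative, not the paper's stated rule.
    """
    stats = {name: summarize_errors(p, true_mm) for name, p in predictions.items()}
    # Tuples compare element by element: mean error first, then standard deviation.
    best = min(stats, key=lambda name: stats[name])
    return best, stats
```

Calling `pick_best({"a": [72, 78, 91], "b": [75, 85, 95]}, [70, 80, 90])` selects `"a"`, since its mean absolute error is lower than the constant 5 mm error of `"b"`.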
Problem

Research questions and friction points this paper is trying to address.

Automated apple ripeness estimation
Apple size detection for harvesting
Selective harvesting framework development
Innovation

Methods, ideas, or system contributions that make the work stand out.

Foundation model-based ripeness estimation
RGBD image dataset integration
Language-model object detection enhancement
Authors

Keyi Zhu
Department of Mechanical Engineering, Michigan State University, East Lansing, MI, USA
Jiajia Li
Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI, USA
Kaixiang Zhang
Zhengzhou University
Bioanalytical chemistry, DNA based nanomaterial
Chaaran Arunachalam
Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI, USA
Siddhartha Bhattacharya
Michigan State University
machine learning, metanetworks, computer vision
Renfu Lu
USDA/ARS
Sensing, imaging, automation, harvest, postharvest quality of fruit and vegetables
Zhaojian Li
Red Cedar Distinguished Associate Professor, Michigan State University
Controls, Learning, Robotics, Connected Vehicles, Smart Agriculture