AppleGrowthVision: A large-scale stereo dataset for phenological analysis, fruit detection, and 3D reconstruction in apple orchards

📅 2025-05-20
🤖 AI Summary
Existing apple orchard monitoring datasets suffer from insufficient scene diversity, labor-intensive annotation, inadequate coverage of phenological growth stages, and a lack of stereo imagery, which limits progress in fruit localization, yield estimation, and 3D reconstruction. To address these gaps, we introduce the first large-scale, binocular stereo image dataset spanning the complete apple growth cycle, systematically aligned with the BBCH phenological scale and enriched with dense pixel-level annotations and agronomically validated labels. We establish a standardized benchmark integrating agricultural science and computer vision, bridging critical gaps in growth modeling and 3D perception. Experiments show that adding this data to existing training sets improves fruit detection: extending MinneApple raises the YOLOv8 F1-score by 7.69%, and extending MinneApple and MAD raises the Faster R-CNN F1-score by 31.06%. Six-stage phenological classification accuracy exceeds 95%, and the stereo data supports high-precision fruit localization and orchard-scale 3D reconstruction.

📝 Abstract
Deep learning has transformed computer vision for precision agriculture, yet apple orchard monitoring remains limited by dataset constraints: diverse, realistic datasets are lacking, and dense, heterogeneous scenes are difficult to annotate. Existing datasets overlook different growth stages and stereo imagery, both essential for realistic 3D modeling of orchards and for tasks like fruit localization, yield estimation, and structural analysis. To address these gaps, we present AppleGrowthVision, a large-scale dataset comprising two subsets. The first includes 9,317 high-resolution stereo images collected from a farm in Brandenburg (Germany), covering six agriculturally validated growth stages over a full growth cycle. The second subset consists of 1,125 densely annotated images from the same farm in Brandenburg and one in Pillnitz (Germany), containing a total of 31,084 apple labels. AppleGrowthVision provides stereo-image data with agriculturally validated growth stages, enabling precise phenological analysis and 3D reconstructions. Extending MinneApple with our data improves YOLOv8 performance by 7.69% in terms of F1-score, while adding it to MinneApple and MAD boosts the Faster R-CNN F1-score by 31.06%. Additionally, six BBCH stages were predicted with over 95% accuracy using VGG16, ResNet152, DenseNet201, and MobileNetV2. AppleGrowthVision bridges the gap between agricultural science and computer vision by enabling the development of robust models for fruit detection, growth modeling, and 3D analysis in precision agriculture. Future work includes improving annotation, enhancing 3D reconstruction, and extending multimodal analysis across all growth stages.
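The detection results above are reported as F1-scores. As a reminder of what that metric measures, here is a minimal sketch that computes F1 from detection counts. The function name and the example counts are hypothetical and are not taken from the paper; how predictions are matched to ground-truth apples (for example, the IoU threshold) is not stated in the abstract and is left out here.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score given true positives, false positives, false negatives.

    Assumes detections have already been matched to ground-truth boxes
    (the matching rule is not specified here). Returns 0.0 when tp == 0.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # share of predicted apples that are correct
    recall = tp / (tp + fn)     # share of real apples that were found
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only
score = f1_score(tp=80, fp=20, fn=20)
print(f"F1 = {score:.3f}")
```

Because F1 is a harmonic mean, a detector must both find most apples and avoid false alarms to score well, which is why it is a common choice for dense fruit-detection scenes.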
Problem

Research questions and friction points this paper is trying to address.

Lack of diverse stereo datasets for apple orchard monitoring
Difficulty annotating dense, heterogeneous orchard scenes
Existing datasets miss growth stages and 3D reconstruction needs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large-scale stereo dataset for apple orchards
High-resolution images with growth stages
Improves fruit detection and 3D reconstruction
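To illustrate why stereo imagery matters for fruit localization and 3D reconstruction, the sketch below applies the standard rectified-stereo relation Z = f · B / d (depth equals focal length times baseline divided by disparity). It assumes a rectified image pair; the function name and the calibration values are hypothetical and do not describe the cameras used for AppleGrowthVision.

```python
def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth in metres of a point seen in a rectified stereo pair.

    disparity_px: horizontal pixel shift of the point between left and right image
    focal_px:     focal length expressed in pixels (assumed value, from calibration)
    baseline_m:   distance between the two camera centres in metres (assumed value)
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


# Hypothetical calibration: 1000 px focal length, 10 cm baseline
z = depth_from_disparity(disparity_px=50.0, focal_px=1000.0, baseline_m=0.1)
print(f"depth = {z:.2f} m")
```

Depth is inversely proportional to disparity, so nearby fruit produces large pixel shifts and can be localized more precisely than distant canopy.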
Laura-Sophia von Hirschhausen
Fraunhofer HHI
Jannes Magnusson
Fraunhofer HHI
Mykyta Kovalenko
Fraunhofer HHI
Fredrik Boye
Fraunhofer IVI
Tanay Rawat
Fraunhofer IVI
Peter Eisert
Professor Visual Computing, Humboldt University Berlin, Fraunhofer HHI
3D video analysis and synthesis, vision, graphics
A. Hilsmann
Fraunhofer HHI
Sebastian Pretzsch
Fraunhofer IVI
Sebastian Bosse
Head of Interactive & Cognitive Systems, Fraunhofer HHI, Germany
computer vision, human-computer interaction, hybrid models, machine learning, cognition modelling