🤖 AI Summary
Problem: Strawberry 6D pose estimation for agricultural harvesting robots relies heavily on real-world annotated data, which is scarce and costly to produce. This slows the development of harvesting automation, which is itself motivated by high costs and a shortage of seasonal labor.
Method: This paper proposes a lightweight solution driven purely by synthetic data: (i) a procedural synthetic data generation pipeline built in Blender to enhance image photorealism and diversity; and (ii) YOLOX-6D-Pose, a single-stage framework that is trained exclusively on synthetic data and estimates 6D poses without any real-image supervision.
Contribution/Results: To our knowledge, this is the first work to empirically validate the effectiveness of purely synthetic data for strawberry 6D pose estimation on a resource-constrained edge device (Jetson Orin Nano). The model achieves comparable ADD-S scores on both the RTX 3090 and the Orin Nano, sustains real-time inference (>15 FPS), and accurately estimates the poses of ripe and partially ripe strawberries, although unripe ones remain difficult to detect. These results indicate strong feasibility for field deployment.
📝 Abstract
Automated, selective fruit harvesting has become an important research area, driven in particular by high costs and a shortage of seasonal labor in advanced economies. This paper focuses on 6D pose estimation of strawberries using purely synthetic data generated through a procedural pipeline for photorealistic rendering. We employ the YOLOX-6D-Pose algorithm, a single-shot approach that leverages the YOLOX backbone, known for its balance between speed and accuracy and for its support for edge inference. To address the limited availability of training data, we introduce a robust and flexible procedural Blender pipeline that generates synthetic strawberry data from various 3D models. Compared with previous work, we focus on enhancing the realism of the synthesized data so that it becomes a valuable resource for training pose estimation algorithms. Quantitative evaluations indicate that our models achieve comparable accuracy on both the NVIDIA RTX 3090 and the Jetson Orin Nano across several ADD-S metrics, with the RTX 3090 demonstrating superior processing speed. The Jetson Orin Nano, however, is particularly suited to resource-constrained environments, making it an excellent choice for deployment in agricultural robotics. Qualitative assessments further confirm the model's performance: it accurately infers the poses of ripe and partially ripe strawberries but has difficulty detecting unripe specimens. This suggests opportunities for future improvement, especially in detecting unripe strawberries (if desired) by exploring variations in color. Furthermore, the methodology could readily be adapted to other fruits such as apples, peaches, and plums, expanding its applicability and impact in agricultural automation.
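For readers unfamiliar with the ADD-S metric mentioned above, the following is a minimal sketch of its standard definition: the mean distance from each model point under the ground-truth pose to its nearest model point under the estimated pose. Using the nearest point makes the metric tolerant of object symmetries, which suits roughly rotationally symmetric fruit. The function names, the toy point set, and the 10%-of-diameter acceptance threshold are illustrative assumptions and are not taken from the paper, whose exact thresholds are not stated here.

```python
import math

def transform(points, R, t):
    """Apply rotation R (3x3 nested lists) and translation t to each 3D point."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]

def add_s(points, R_gt, t_gt, R_est, t_est):
    """Mean distance from each ground-truth-posed point to the nearest estimated-posed point."""
    gt = transform(points, R_gt, t_gt)
    est = transform(points, R_est, t_est)
    return sum(min(math.dist(g, e) for e in est) for g in gt) / len(gt)

def is_correct(add_s_value, diameter, fraction=0.1):
    """Assumed acceptance rule: error below a fraction of the object diameter."""
    return add_s_value < fraction * diameter

# Toy model (hypothetical): points that are symmetric under 90-degree turns about z.
pts = [(1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0), (0, 0, 1)]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ROT_Z_90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
ZERO = [0, 0, 0]

# A rotation that maps the model onto itself is not penalized by ADD-S.
symmetric_error = add_s(pts, IDENTITY, ZERO, ROT_Z_90, ZERO)

# A small translation error along z is penalized proportionally.
shift_error = add_s(pts, IDENTITY, ZERO, IDENTITY, [0, 0, 0.1])

print(symmetric_error, shift_error, is_correct(shift_error, diameter=2.0))
```

The brute-force nearest-neighbor search is quadratic in the number of points; for dense meshes a spatial index such as a k-d tree would be the usual substitute.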