🤖 AI Summary
Addressing the challenge of autonomously grasping deformable leaves in agricultural environments, this paper proposes a geometric-prior-guided, self-supervised, end-to-end manipulation framework. Methodologically, it introduces a confidence-weighted fusion mechanism in which a geometric module acts as an "expert teacher," generating high-quality pseudo-labels that guide the training of the neural grasp-point module, GraspPointCNN; it further integrates YOLOv8-based instance segmentation with RAFT-Stereo-based stereo depth estimation for robust perception. The key contribution is the first dynamic coupling of geometric reasoning and neural learning within a self-supervised training paradigm, eliminating the need for manual annotations. In greenhouse experiments, the approach achieves an 84.7% grasp success rate, outperforming the purely geometric method by 9.4 percentage points and the purely neural method by 24.5 percentage points. This demonstrates substantial improvements in reliability and generalizability for manipulating deformable objects in complex agricultural settings.
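The summary's confidence-weighted fusion can be pictured as a convex combination whose neural weight grows with prediction certainty. This is a minimal illustrative sketch, not the paper's implementation; the function name, the cap on the neural weight, and the scalar-score interface are all assumptions.

```python
def fuse_grasp_scores(geometric_score: float,
                      neural_score: float,
                      neural_confidence: float,
                      max_neural_weight: float = 0.5) -> float:
    """Blend geometric and neural grasp-point scores (illustrative sketch).

    The neural score's weight scales with its prediction confidence and is
    capped so the geometric prior always retains some influence. With zero
    confidence the fused score falls back to the geometric score alone.
    """
    # Clamp confidence to [0, 1] before scaling it into a fusion weight.
    confidence = min(max(neural_confidence, 0.0), 1.0)
    w = max_neural_weight * confidence
    return (1.0 - w) * geometric_score + w * neural_score
```

The cap (here 0.5) encodes the framework's bias toward the geometric prior: even a fully confident neural prediction cannot override it outright.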
📝 Abstract
Automating leaf manipulation in agricultural settings faces significant challenges, including the variability of plant morphologies and deformable leaves. We propose a novel hybrid geometric-neural approach for autonomous leaf grasping that combines traditional computer vision with neural networks through self-supervised learning. Our method integrates YOLOv8 for instance segmentation and RAFT-Stereo for 3D depth estimation to build rich leaf representations, which feed into both a geometric feature scoring pipeline and a neural refinement module (GraspPointCNN). The key innovation is our confidence-weighted fusion mechanism that dynamically balances the contribution of each approach based on prediction certainty. Our self-supervised framework uses the geometric pipeline as an expert teacher to automatically generate training data. Experiments demonstrate that our approach achieves an 88.0% success rate in controlled environments and 84.7% in real greenhouse conditions, significantly outperforming both purely geometric (75.3%) and neural (60.2%) methods. This work establishes a new paradigm for agricultural robotics where domain expertise is seamlessly integrated with machine learning capabilities, providing a foundation for fully automated crop monitoring systems.
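The self-supervised framework described above, where the geometric pipeline acts as an expert teacher that automatically generates training data, can be sketched as a simple filtering loop: keep only the teacher's high-confidence grasp points as pseudo-labels for GraspPointCNN. All names here (the function, the score threshold, the `(mask, depth)` sample format) are illustrative assumptions, not the paper's API.

```python
from typing import Callable, List, Tuple

# A teacher maps a (leaf_mask, depth_map) pair to a grasp point and a score.
Sample = Tuple[object, object]
Teacher = Callable[[object, object], Tuple[Tuple[int, int], float]]

def generate_pseudo_labels(
    samples: List[Sample],
    geometric_teacher: Teacher,
    min_score: float = 0.7,  # assumed threshold: keep confident teacher outputs
) -> List[Tuple[object, object, Tuple[int, int]]]:
    """Turn the geometric pipeline's confident predictions into training labels."""
    labels = []
    for mask, depth in samples:
        grasp_point, score = geometric_teacher(mask, depth)
        if score >= min_score:  # discard low-quality pseudo-labels
            labels.append((mask, depth, grasp_point))
    return labels
```

Because labels come from the geometric scorer rather than human annotators, the neural module can be trained on whatever the robot observes in the field, which is what removes the manual-annotation bottleneck the abstract highlights.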