🤖 AI Summary
Agricultural robots frequently suffer from severe occlusion in dense crop environments, leading to visual perception failure and hindering autonomous harvesting performance. To address this, we propose an imitation learning-based active viewpoint planning method that leverages 6-DoF camera motion to acquire unoccluded target observations. Our core contribution is the first application of Action Chunking with Transformer (ACT) to viewpoint planning—eliminating handcrafted reward functions or evaluation metrics and enabling end-to-end learning of continuous motion policies with cross-crop generalization. Integrating RGB-D perception with robot control, the method significantly improves harvesting success rate and efficiency under heavy occlusion in both simulation and real-world field experiments. Crucially, it requires no task-specific reprogramming to adapt across diverse crops, demonstrating robust transferability and practical deployability.
📝 Abstract
In agricultural automation, inherent occlusion presents a major challenge for robotic harvesting. We propose a novel imitation learning-based viewpoint planning approach that actively adjusts the camera viewpoint to capture unobstructed images of the target crop. Traditional viewpoint planners and existing learning-based methods, which depend on manually designed evaluation metrics or reward functions, often struggle to generalize to complex, unseen scenarios. Our method employs the Action Chunking with Transformer (ACT) algorithm to learn effective camera motion policies from expert demonstrations. This enables continuous six-degree-of-freedom (6-DoF) viewpoint adjustments that are smoother and more precise, revealing occluded targets. Extensive experiments in both simulated and real-world environments, featuring agricultural scenarios and a 6-DoF robot arm equipped with an RGB-D camera, demonstrate our method's superior success rate and efficiency, especially under complex occlusion conditions, as well as its ability to generalize across different crops without reprogramming. This study advances robotic harvesting by providing a practical "learn from demonstration" (LfD) solution to occlusion challenges, ultimately enhancing autonomous harvesting performance and productivity.
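The ACT recipe the abstract refers to predicts a short *chunk* of future actions at each step and blends overlapping chunks at execution time (temporal ensembling). A minimal sketch of that control loop is below; the chunk size, decay rate, and the dummy policy are illustrative assumptions, not the paper's actual values, and a real policy would be a transformer over RGB-D features:

```python
import numpy as np

CHUNK = 8  # actions predicted per policy query (hypothetical chunk size)
DOF = 6    # 6-DoF camera pose delta: (x, y, z, roll, pitch, yaw)

def policy(obs: np.ndarray) -> np.ndarray:
    """Stand-in for a trained ACT policy: maps an observation to a chunk
    of future actions. Here it returns a fixed small motion for illustration."""
    return np.tile(0.01 * np.ones(DOF), (CHUNK, 1))

def temporal_ensemble(history, t, m=0.1):
    """ACT-style temporal ensembling: average every chunk prediction that
    covers timestep t, down-weighting older predictions by exp(-m * age)."""
    preds, weights = [], []
    for t0, chunk in history:  # chunk predicted at t0 covers t0 .. t0+CHUNK-1
        if t0 <= t < t0 + len(chunk):
            preds.append(chunk[t - t0])
            weights.append(np.exp(-m * (t - t0)))
    w = np.array(weights) / np.sum(weights)
    return (np.stack(preds) * w[:, None]).sum(axis=0)

# Control loop sketch: query the policy every step, blend overlapping chunks.
history = []
for t in range(4):
    obs = np.zeros(10)  # placeholder for an RGB-D feature vector
    history.append((t, policy(obs)))
    action = temporal_ensemble(history, t)
    # here, `action` (a 6-DoF pose increment) would be sent to the arm controller
```

Chunking amortizes inference over several control steps, while the ensemble smooths transitions between chunks, which is what yields the continuous, smooth 6-DoF viewpoint motion described above.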