Vision-guided Autonomous Dual-arm Extraction Robot for Bell Pepper Harvesting

📅 2026-03-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenges of automating sweet pepper harvesting in outdoor unstructured environments, where occlusion and complex backgrounds hinder reliable perception and manipulation. To this end, the authors propose VADER, a dual-arm mobile harvesting robot that integrates hierarchical visual perception—from scene-level detection to fruit-level pose estimation—coordinated dual-arm motion planning, and a teleoperation fallback mechanism based on the GELLO framework. The system achieves the first demonstration of autonomous, coordinated dual-arm harvesting in real-world agricultural fields. A cross-domain sweet pepper dataset comprising over 3,200 images was curated to enable end-to-end training. Experimental results show a harvesting success rate exceeding 60% under outdoor conditions, with a per-fruit cycle time under 100 seconds. The dataset and code have been publicly released to advance research in agricultural robotics.

📝 Abstract
Agricultural robotics has emerged as a critical solution to the labor shortages and rising costs associated with manual crop harvesting. Bell pepper harvesting, in particular, is a labor-intensive task, accounting for up to 50% of total production costs. While automated solutions have shown promise in controlled greenhouse environments, harvesting in unstructured outdoor farms remains an open challenge due to environmental variability and occlusion. This paper presents VADER (Vision-guided Autonomous Dual-arm Extraction Robot), a dual-arm mobile manipulation system designed specifically for the autonomous harvesting of bell peppers in outdoor environments. The system integrates a robust perception pipeline coupled with a dual-arm planning framework that coordinates a gripping arm and a cutting arm for extraction. We validate the system through trials in various realistic conditions, demonstrating a harvest success rate exceeding 60% with a cycle time of under 100 seconds per fruit, while also featuring a teleoperation fail-safe based on the GELLO teleoperation framework to ensure robustness. To support robust perception, we contribute a hierarchically structured dataset of over 3,200 images spanning indoor and outdoor domains, pairing wide-field scene images with close-up pepper images to enable a coarse-to-fine training strategy from fruit detection to high-precision pose estimation. The code and dataset will be made publicly available upon acceptance.
Problem

Research questions and friction points this paper is trying to address.

autonomous harvesting
bell pepper
outdoor farming
occlusion
environmental variability
Innovation

Methods, ideas, or system contributions that make the work stand out.

dual-arm manipulation
vision-guided harvesting
outdoor agricultural robotics
hierarchical perception dataset
teleoperation fail-safe
Kshitij Madhav Bhat
Carnegie Mellon University
Robotics
Tom Gao
Student, University of Michigan
Robotics
Abhishek Mathur
Robotics Institute, Carnegie Mellon University, Pittsburgh, PA
Rohit Satishkumar
Robotics Institute, Carnegie Mellon University, Pittsburgh, PA
Francisco Yandun
Carnegie Mellon University
Machine learning, Mobile robotics, Estimation Theory, Computer Vision
Dominik Bauer
Robotics Institute, Carnegie Mellon University, Pittsburgh, PA
Nancy Pollard
Carnegie Mellon University
Robotics, Computer Graphics, Hands, Dexterous Manipulation