YOLO26-RipeLoc Lite: A lightweight architecture for tomato ripeness detection and picking point localization in greenhouse robotic harvesting

📅 2026-05-26

🤖 AI Summary

This study addresses three coupled tasks in autonomous harvesting of greenhouse tomatoes: fruit detection, ripeness classification, and picking-point localization. It proposes YOLO26-RipeLoc Lite, a lightweight YOLO26-based model that performs all three simultaneously. The model introduces three components: a Lightweight Feature Pyramid Network (LFPN) built on depthwise separable convolutions, a Ripeness-Aware Attention Module (RAAM) that combines dual pooling with a learnable ripeness bias vector, and a Compact Detection Head (CDH) with shared convolutions and a center-point regression branch. Training uses greenhouse-aware HSV augmentation and a 3-phase progressive unfreezing strategy. Evaluated on a dataset of 1,500 images (6,227 instances), the model achieves 92.9% mAP@0.5 with only 2.38M parameters; post-training BatchNorm pruning at 30% reduces the parameter count to about 1.8M with negligible accuracy loss. Compared with YOLOv8n/s, YOLO11n/s, YOLO12n/s, and YOLO26s, it reports the highest precision (95.2%), 2.9 pp above YOLO12n with 7.0% fewer parameters.
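The text here does not say which criterion the BatchNorm pruning uses. One common choice, assumed here purely for illustration, is to rank channels by the magnitude of their BatchNorm scale factor and drop the smallest ones under a single global threshold. The function name, layer names, and scale values below are hypothetical and are not taken from the paper.

```python
def select_channels_to_prune(bn_scales, prune_percent=30):
    """Return {layer_name: [channel indices to drop]} using a global |gamma| threshold.

    bn_scales maps a (hypothetical) layer name to its list of BatchNorm scale factors.
    This is an assumed criterion, not necessarily the one used by the authors.
    """
    # Pool the magnitudes of every scale factor in the network and sort them.
    magnitudes = sorted(abs(g) for scales in bn_scales.values() for g in scales)
    n_prune = len(magnitudes) * prune_percent // 100
    if n_prune == 0:
        return {name: [] for name in bn_scales}
    threshold = magnitudes[n_prune - 1]  # largest magnitude that still gets pruned
    plan = {}
    for name, scales in bn_scales.items():
        drop = [i for i, g in enumerate(scales) if abs(g) <= threshold]
        # Keep at least one channel so a layer is never removed entirely.
        if len(drop) == len(scales):
            keep = max(range(len(scales)), key=lambda i: abs(scales[i]))
            drop.remove(keep)
        plan[name] = drop
    return plan


# Toy scale factors; the values are made up for illustration.
toy = {
    "neck.bn1": [0.90, 0.02, 0.40, 0.01],
    "head.bn1": [0.05, 0.70, 0.30, 0.60, 0.03, 0.80],
}
plan = select_channels_to_prune(toy, prune_percent=30)
print(plan)
```

Under this criterion the pruning ratio counts channels, not parameters. That would be consistent with the reported drop from 2.38M to roughly 1.8M parameters, which is about a 24% reduction rather than 30%, although the source does not state what the 30% refers to.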

📝 Abstract

In greenhouse tomato production, automated harvesting requires accurate detection of ripe tomatoes, ripeness classification, and precise picking-point localization for robotic end-effectors. This paper proposes YOLO26-RipeLoc Lite, a lightweight deep learning architecture based on YOLO26 for simultaneous detection, ripeness classification, and center-point localization of greenhouse tomatoes. The model introduces three modifications: (1) a Lightweight Feature Pyramid Network (LFPN) with depthwise separable convolutions for efficient multi-scale fusion, (2) a Ripeness-Aware Attention Module (RAAM) with dual pooling and a learnable ripeness bias vector for enhanced color-texture discrimination, and (3) a Compact Detection Head (CDH) with shared convolutions and an integrated center-point regression branch for direct grasp planning. The model is evaluated on a custom dataset of 1,500 images with 6,227 instances (3,566 ripe, 2,661 unripe) from the SILAL greenhouse, Abu Dhabi, UAE. YOLO26-RipeLoc Lite achieves an mAP@0.5 of 92.9% (95.2% ripe, 90.6% unripe) with the highest precision (95.2%) among all evaluated architectures, using only 2.38M parameters. Post-training BatchNorm pruning at 30% reduces parameters to ~1.8M with negligible accuracy loss. Ablation studies confirm that greenhouse-aware HSV augmentation provides the largest improvement (+2.02 pp mAP@0.5), backbone freezing achieves peak precision (93.8%), and 3-phase progressive unfreezing yields the best localization quality (mAP@0.5:0.95 of 64.6%). Comparisons with YOLOv8n/s, YOLO11n/s, YOLO12n/s, and YOLO26s confirm a superior accuracy-efficiency trade-off: 2.9 pp higher precision than YOLO12n with 7.0% fewer parameters, plus integrated center-point localization for robotic end-effector guidance.
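To give a sense of why depthwise separable convolutions help keep the parameter count low, the following sketch compares the weight count of a standard convolution with that of a depthwise-plus-pointwise pair. The channel widths and kernel size are illustrative assumptions and are not taken from the paper; biases and normalization layers are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    # One k x k kernel per (input channel, output channel) pair; bias omitted.
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


c_in, c_out, k = 128, 128, 3  # illustrative sizes, not from the paper
std = standard_conv_params(c_in, c_out, k)
dws = depthwise_separable_params(c_in, c_out, k)
ratio = dws / std  # equals 1/c_out + 1/k**2
print(std, dws, round(ratio, 3))
```

For these sizes the separable layer needs roughly 12% of the weights of the standard one. The saving in the full model depends on how many layers are replaced, which the abstract does not detail.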

Problem

Research questions and friction points this paper is trying to address.

tomato ripeness detection

picking point localization

greenhouse robotic harvesting

automated harvesting

ripe tomato classification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight Feature Pyramid Network

Ripeness-Aware Attention Module

Compact Detection Head
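The Ripeness-Aware Attention Module is described only as using dual pooling and a learnable ripeness bias vector. A minimal sketch of one possible reading follows: per-channel average and max descriptors are summed, shifted by a per-channel bias, and passed through a sigmoid to gate the channel. The function name and this exact formulation are assumptions; the shared MLP that channel-attention modules often include is left out for brevity, and the actual RAAM design may differ.

```python
import math


def dual_pool_channel_attention(feature_map, bias):
    """Gate each channel using its average- and max-pooled descriptors.

    feature_map: list of channels, each a 2-D list (H x W) of floats.
    bias: one float per channel, standing in for a learnable bias vector.
    Hypothetical formulation, not the paper's confirmed design.
    """
    out = []
    for channel, b in zip(feature_map, bias):
        values = [v for row in channel for v in row]
        avg_desc = sum(values) / len(values)  # global average pooling
        max_desc = max(values)                # global max pooling
        # Sum both descriptors, shift by the bias, squash to (0, 1).
        gate = 1.0 / (1.0 + math.exp(-(avg_desc + max_desc + b)))
        out.append([[v * gate for v in row] for row in channel])
    return out


# One 2 x 2 channel of ones; the bias of -2.0 cancels avg + max, so the gate is 0.5.
gated = dual_pool_channel_attention([[[1.0, 1.0], [1.0, 1.0]]], [-2.0])
print(gated)
```

In a trained model the bias would be learned jointly with the other weights, so that channels carrying ripeness-relevant color or texture cues receive a larger gate.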