CA-Cut: Crop-Aligned Cutout for Data Augmentation to Learn More Robust Under-Canopy Navigation

📅 2025-07-23
🤖 AI Summary
Problem: Visual navigation models for agricultural fields lose robustness under occlusion, clutter, and irregular crop spacing, and they depend heavily on large annotated datasets. Method: This paper proposes Crop-Aligned Cutout (CA-Cut), a random masking augmentation tailored to semantic keypoint prediction. Unlike conventional masking strategies, CA-Cut biases the masked regions toward the crop rows on either side of the image to emulate real-world occlusion patterns; ablation studies determine the mask count, mask size, and spatial distribution that maximize performance. Contribution/Results: Evaluated on a public cornfield dataset, the method reduces keypoint prediction error by up to 36.9%, improving generalization and robustness across diverse field environments. It offers a way to lower the data demands of agricultural visual navigation.

📝 Abstract
State-of-the-art visual under-canopy navigation methods are designed with deep learning-based perception models to distinguish traversable space from crop rows. While these models have demonstrated successful performance, they require large amounts of training data to ensure reliability in real-world field deployment. However, data collection is costly, demanding significant human resources for in-field sampling and annotation. To address this challenge, various data augmentation techniques are commonly employed during model training, such as color jittering, Gaussian blur, and horizontal flip, to diversify training data and enhance model robustness. In this paper, we hypothesize that utilizing only these augmentation techniques may lead to suboptimal performance, particularly in complex under-canopy environments with frequent occlusions, debris, and non-uniform spacing of crops. Instead, we propose a novel augmentation method, Crop-Aligned Cutout (CA-Cut), which masks out random regions of input images that are spatially distributed around the crop rows on the sides, encouraging trained models to capture high-level contextual features even when fine-grained information is obstructed. Our extensive experiments with a public cornfield dataset demonstrate that masking-based augmentations are effective for simulating occlusions and significantly improving robustness in semantic keypoint predictions for visual navigation. In particular, we show that biasing the mask distribution toward crop rows in CA-Cut is critical for enhancing both prediction accuracy and generalizability across diverse environments, achieving up to a 36.9% reduction in prediction error. In addition, we conduct ablation studies to determine the number of masks, the size of each mask, and the spatial distribution of masks to maximize overall performance.
Problem

Research questions and friction points this paper is trying to address.

Enhancing robustness in under-canopy navigation models
Reducing reliance on costly real-world data collection
Improving accuracy in occluded crop row environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Crop-Aligned Cutout masks random regions
Biases mask distribution toward crop rows
Enhances robustness in keypoint predictions
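
The idea of biasing cutout masks toward crop rows can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how crop rows are localised or how masks are sampled, so the sketch assumes each crop row is approximated by a line segment in pixel coordinates and that mask centres are drawn along those segments with Gaussian jitter. The function names, the `row_lines` input, and the default values of `n_masks`, `mask_size`, and `jitter` are all hypothetical.

```python
import random


def sample_crop_aligned_boxes(height, width, row_lines, n_masks=4,
                              mask_size=40, jitter=10.0, rng=None):
    """Sample square mask boxes centred near crop-row line segments.

    row_lines is a list of ((x0, y0), (x1, y1)) segments in pixel
    coordinates. This representation is an assumption of the sketch.
    Returns (top, left, bottom, right) boxes clipped to the image.
    """
    rng = rng or random.Random()
    half = mask_size // 2
    boxes = []
    for _ in range(n_masks):
        (x0, y0), (x1, y1) = rng.choice(row_lines)
        t = rng.random()  # relative position along the chosen segment
        cx = int(round(x0 + t * (x1 - x0) + rng.gauss(0.0, jitter)))
        cy = int(round(y0 + t * (y1 - y0) + rng.gauss(0.0, jitter)))
        # Clamp every edge to the image so boxes never leave the frame.
        top = min(max(cy - half, 0), height)
        bottom = min(max(cy + half, 0), height)
        left = min(max(cx - half, 0), width)
        right = min(max(cx + half, 0), width)
        boxes.append((top, left, bottom, right))
    return boxes


def apply_cutout(image, boxes, fill=0):
    """Return a copy of image (a list of pixel rows) with boxes filled."""
    out = [row[:] for row in image]
    for top, left, bottom, right in boxes:
        for y in range(top, bottom):
            out[y][left:right] = [fill] * (right - left)
    return out


if __name__ == "__main__":
    h, w = 60, 80
    # Two hypothetical row lines fanning out from a point near the top.
    lines = [((40, 10), (5, 59)), ((40, 10), (75, 59))]
    image = [[1] * w for _ in range(h)]
    boxes = sample_crop_aligned_boxes(h, w, lines, n_masks=5,
                                      mask_size=12, jitter=3.0,
                                      rng=random.Random(0))
    masked = apply_cutout(image, boxes)
    print(sum(v == 0 for row in masked for v in row), "pixels masked")
```

Only the input image is altered; as with ordinary cutout, the keypoint labels would stay unchanged. The image is a nested list here to keep the sketch free of dependencies; with an array library the inner loop would typically become a single slice assignment. The three parameters correspond loosely to the quantities the ablation studies vary (number of masks, mask size, and spatial distribution).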