🤖 AI Summary
To address the poor robustness and heavy reliance on labeled data for ball detection in dynamic outdoor RoboCup scenarios, this paper proposes a low-resource adaptive detection framework. The method integrates multi-task self-supervised pretraining (color reconstruction, edge prediction, and triplet contrastive learning), pseudo-label distillation, and Model-Agnostic Meta-Learning (MAML) to enable end-to-end feature extraction and rapid domain adaptation. Evaluated on a newly constructed 10,000-image RoboCup Standard Platform League (SPL) dataset, the approach significantly outperforms supervised baselines, achieving +12.3% detection accuracy, +14.7% F1-score, +16.1% IoU, and 40% faster convergence. This work is the first to combine multi-task self-supervision, pseudo-label distillation, and MAML for vision-based ball detection on humanoid robots, delivering a lightweight, generalizable solution for resource-constrained dynamic environments.
📝 Abstract
Robust and accurate ball detection is a critical component for autonomous humanoid soccer robots, particularly in dynamic and challenging environments such as RoboCup outdoor fields. However, traditional supervised approaches require extensive manual annotation, which is costly and time-intensive. To overcome this problem, we present a self-supervised learning framework for domain-adaptive feature extraction to enhance ball detection performance. The proposed approach leverages a general-purpose pretrained model to generate pseudo-labels, which are then used in a suite of self-supervised pretext tasks (colorization, edge detection, and triplet-loss-based embedding learning) to learn robust visual features without relying on manual annotations. Additionally, a model-agnostic meta-learning (MAML) strategy is incorporated to ensure rapid adaptation to new deployment scenarios with minimal supervision. A new dataset comprising 10,000 labeled images from outdoor RoboCup SPL matches is introduced, used to validate the method, and made available to the community. Experimental results demonstrate that the proposed pipeline outperforms baseline models in terms of accuracy, F1 score, and IoU, while also exhibiting faster convergence.
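For readers unfamiliar with the triplet pretext task mentioned above, the following is a minimal sketch of the standard margin-based triplet loss. The abstract does not state which formulation, distance metric, or margin the authors use, so the Euclidean distance, the `margin=1.0` default, the function names, and the toy 2-D embeddings are all illustrative assumptions rather than the paper's implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (assumed form, not taken from the paper).

    The loss is zero once the negative is at least `margin` farther from the
    anchor than the positive is; otherwise it penalizes the shortfall.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-D embeddings (hypothetical values chosen for easy arithmetic).
anchor = [0.0, 0.0]         # e.g. an image patch containing the ball
positive = [0.0, 1.0]       # another ball patch, distance 1.0 from the anchor
far_negative = [3.0, 4.0]   # background patch, distance 5.0 from the anchor
near_negative = [0.0, 1.5]  # confusing background patch, distance 1.5

easy = triplet_loss(anchor, positive, far_negative)   # 1.0 - 5.0 + 1.0 < 0, clipped to 0
hard = triplet_loss(anchor, positive, near_negative)  # 1.0 - 1.5 + 1.0 = 0.5
print(easy, hard)
```

In this sketch only the "hard" triplet contributes a gradient signal, which is why such a loss encourages embeddings where ball patches cluster together and look-alike background patches are pushed away.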