Point2RBox-v3: Self-Bootstrapping from Point Annotations via Integrated Pseudo-Label Refinement and Utilization

📅 2025-09-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Weakly supervised oriented object detection under point-level annotations faces two major bottlenecks: inefficient utilization and low quality of pseudo-labels. To address these, this paper proposes a bootstrapping pseudo-label optimization framework. First, it introduces a dynamic pseudo-label assignment mechanism, incorporating a progressive label assignment (PLA) strategy and a prior-guided dynamic mask loss that jointly leverages SAM’s semantic consistency and watershed-based local structural modeling. Second, it designs a pseudo-label refinement module to enhance robustness in both sparse and dense scenarios for localization and classification. Extensive experiments on remote sensing benchmarks—including DOTA-v1.0, DIOR, and STAR—demonstrate significant improvements over existing weakly supervised methods, particularly under challenging conditions such as large intra-class scale variation and highly sparse object distributions.

📝 Abstract

Driven by the growing need for Oriented Object Detection (OOD), learning from point annotations under a weakly-supervised framework has emerged as a promising alternative to costly and laborious manual labeling. In this paper, we discuss two deficiencies in existing point-supervised methods: inefficient utilization and poor quality of pseudo labels. To address them, we present Point2RBox-v3. At its core are two principles: 1) Progressive Label Assignment (PLA), which dynamically estimates instance sizes in a coarse yet intelligent manner at different stages of the training process, enabling the use of label assignment methods. 2) Prior-Guided Dynamic Mask Loss (PGDM-Loss), an enhancement of the Voronoi Watershed Loss from Point2RBox-v2 that overcomes both the poor performance of Watershed in sparse scenes and the poor performance of SAM in dense scenes. To our knowledge, Point2RBox-v3 is the first model to employ dynamic pseudo labels for label assignment, and it combines the complementary strengths of the SAM model and the watershed algorithm to achieve excellent performance in both sparse and dense scenes. Our solution gives competitive performance, especially in scenarios with large variations in object size or sparse object occurrences: 66.09%/56.86%/41.28%/46.40%/19.60%/45.96% on DOTA-v1.0/DOTA-v1.5/DOTA-v2.0/DIOR/STAR/RSAR.
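
To make the idea of size-dependent label assignment from point labels concrete, here is a minimal sketch. It is not the paper's PLA: the abstract does not specify how sizes are estimated, so the nearest-neighbour heuristic, the function names `coarse_sizes` and `assign_level`, and the level bounds are all assumptions chosen for illustration. It also covers only a single static estimate, whereas PLA is described as updating its estimates at different stages of training.

```python
import math


def coarse_sizes(points, max_size=512.0):
    """Estimate a coarse size for each annotated point.

    Assumption (not from the paper): with only point labels available,
    the distance to the nearest other annotated point is used as a rough
    upper bound on the object's extent, capped at max_size.
    """
    sizes = []
    for i, (xi, yi) in enumerate(points):
        dists = [
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        ]
        # A lone point has no neighbour, so fall back to the cap.
        sizes.append(min(min(dists), max_size) if dists else max_size)
    return sizes


def assign_level(size, bounds=(64.0, 128.0, 256.0)):
    """Map a size to a feature-pyramid level index (0 = finest).

    The bounds are hypothetical scale ranges, not values from the paper.
    """
    for level, upper in enumerate(bounds):
        if size <= upper:
            return level
    return len(bounds)


# Two nearby points (a dense region) and one isolated point (a sparse region).
points = [(10.0, 10.0), (40.0, 50.0), (400.0, 400.0)]
sizes = coarse_sizes(points)
levels = [assign_level(s) for s in sizes]
```

In this toy input the two nearby points receive small size estimates and land on the finest level, while the isolated point receives a large estimate and lands on the coarsest one. A progressive scheme would replace these heuristic sizes with the detector's own predictions as training advances.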

Problem

Research questions and friction points this paper is trying to address.

Improving pseudo-label quality in point-supervised oriented object detection

Enhancing pseudo-label utilization efficiency for weakly-supervised learning

Overcoming performance limitations in both sparse and dense scenes

Innovation

Methods, ideas, or system contributions that make the work stand out.

Progressive Label Assignment dynamically estimates instance sizes

Prior-Guided Dynamic Mask Loss enhances the Voronoi Watershed Loss from Point2RBox-v2

Self-bootstrapping integrates pseudo-label refinement and utilization
