Point2RBox-v3: Self-Bootstrapping from Point Annotations via Integrated Pseudo-Label Refinement and Utilization

📅 2025-09-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Weakly supervised oriented object detection under point-level annotations faces two major bottlenecks: inefficient utilization and low quality of pseudo-labels. To address these, this paper proposes a bootstrapping pseudo-label optimization framework. First, it introduces a dynamic pseudo-label assignment mechanism, incorporating a progressive label assignment (PLA) strategy and a prior-guided dynamic mask loss that jointly leverages SAM’s semantic consistency and watershed-based local structural modeling. Second, it designs a pseudo-label refinement module to enhance robustness in both sparse and dense scenarios for localization and classification. Extensive experiments on remote sensing benchmarks—including DOTA-v1.0, DIOR, and STAR—demonstrate significant improvements over existing weakly supervised methods, particularly under challenging conditions such as large intra-class scale variation and highly sparse object distributions.

📝 Abstract
Driven by the growing need for Oriented Object Detection (OOD), learning from point annotations under a weakly-supervised framework has emerged as a promising alternative to costly and laborious manual labeling. In this paper, we discuss two deficiencies in existing point-supervised methods: inefficient utilization and poor quality of pseudo labels. To address them, we present Point2RBox-v3, which is built on two core components: 1) Progressive Label Assignment (PLA), which dynamically produces coarse estimates of instance sizes at different stages of training, enabling the use of label assignment methods. 2) Prior-Guided Dynamic Mask Loss (PGDM-Loss), an enhancement of the Voronoi Watershed Loss from Point2RBox-v2 that addresses both the poor performance of the watershed algorithm in sparse scenes and the poor performance of SAM in dense scenes. To our knowledge, Point2RBox-v3 is the first model to employ dynamic pseudo labels for label assignment, and it combines the complementary strengths of the SAM model and the watershed algorithm to perform well in both sparse and dense scenes. Our solution gives competitive performance, especially in scenarios with large variations in object size or sparse object occurrences: 66.09%/56.86%/41.28%/46.40%/19.60%/45.96% on DOTA-v1.0/DOTA-v1.5/DOTA-v2.0/DIOR/STAR/RSAR.
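The idea behind Progressive Label Assignment, estimating a size for each point-annotated instance and using it to drive label assignment, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than the paper's implementation: the nearest-neighbor size guess, the `warmup_epochs` schedule, the pixel ranges in `LEVEL_RANGES`, and all function names are hypothetical.

```python
import math

# Illustrative pixel ranges per feature-pyramid level (assumed values,
# not taken from the paper).
LEVEL_RANGES = [(0, 64), (64, 128), (128, 256), (256, 512), (512, math.inf)]


def coarse_size_from_neighbors(point, others, default=128.0):
    """Early-stage size guess: distance to the nearest other annotated point."""
    if not others:
        return default
    return min(math.dist(point, q) for q in others)


def estimate_size(point, others, predicted_size, epoch, warmup_epochs=6):
    """Hypothetical schedule: use the neighbor-based guess during warm-up,
    then switch to the size predicted by the detector itself."""
    if epoch < warmup_epochs or predicted_size is None:
        return coarse_size_from_neighbors(point, others)
    return predicted_size


def assign_level(size):
    """Return the index of the pyramid level whose range contains `size`."""
    size = max(size, 0.0)  # clamp so negative inputs map to the first level
    for level, (low, high) in enumerate(LEVEL_RANGES):
        if low <= size < high:
            return level
    return len(LEVEL_RANGES) - 1


# Usage: the same point is assigned differently before and after warm-up.
point, others = (10.0, 10.0), [(40.0, 50.0)]
early = assign_level(estimate_size(point, others, 200.0, epoch=0))
late = assign_level(estimate_size(point, others, 200.0, epoch=10))
```

The point of the sketch is only that the size estimate changes as training progresses, so the level an instance is assigned to can change with it.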
Problem

Research questions and friction points this paper is trying to address.

Improving pseudo-label quality in point-supervised oriented object detection
Enhancing pseudo-label utilization efficiency for weakly-supervised learning
Overcoming performance limitations in both sparse and dense scenes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Progressive Label Assignment dynamically estimates instance sizes
Prior-Guided Dynamic Mask Loss enhances Voronoi Watershed algorithm
Self-bootstrapping integrates pseudo-label refinement and utilization
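As a rough illustration of how a mask prior might be chosen dynamically between SAM and watershed, the sketch below routes each image by how many point annotations it contains. The rule itself, the `dense_threshold` value, and the function names are hypothetical; this summary does not state the criterion the paper actually uses.

```python
def choose_mask_prior(num_points, dense_threshold=20):
    """Assumed rule: crowded images use the watershed prior,
    sparse images use the SAM prior."""
    return "watershed" if num_points >= dense_threshold else "sam"


def route_images(annotations, dense_threshold=20):
    """Map each image id to the mask prior selected for it.

    `annotations` maps an image id to its list of (x, y) point labels.
    """
    return {
        image_id: choose_mask_prior(len(points), dense_threshold)
        for image_id, points in annotations.items()
    }


# Usage with two made-up images: one crowded, one sparse.
annotations = {
    "harbor": [(i, i) for i in range(40)],
    "field": [(5, 5), (90, 30)],
}
routing = route_images(annotations)
```

A count-based switch is the simplest possible choice; a local density measure would serve the same purpose.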
Authors

Teng Zhang
Shanghai Jiao Tong University

Ziqian Fan
South China University of Technology

Mingxin Liu
Shanghai Jiao Tong University

Xin Zhang
Nankai University

Xudong Lu
PhD student, the Chinese University of Hong Kong
Computer Vision, Machine Learning

Wentong Li
Nanjing University of Aeronautics and Astronautics
Computer Vision, Machine Learning, Vision-Language Model, Robotics

Yue Zhou
East China Normal University

Yi Yu
Ohio State University

Xiang Li
Nankai University

Junchi Yan
FIAPR & ICML Board Member, SJTU (2018-), SII (2024-), AWS (2019-2022), IBM (2011-2018)
Computational Intelligence, AI4Science, Machine Learning, Autonomous Driving

Xue Yang
Shanghai Jiao Tong University