Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances

2025-02-06
AI Summary
This work addresses the lack of instance-level spatial layout modeling in point-supervised oriented object detection (OOD). It proposes what the authors describe as the first point-supervised approach to exploit the spatial layout among instances. The method combines three losses: (i) a Gaussian overlap loss, which treats objects as 2D Gaussian distributions and minimizes their overlap to learn an upper bound for each instance; (ii) a Voronoi watershed loss, which applies watershed on a Voronoi tessellation to learn a lower bound for each instance; and (iii) a consistency loss, which learns the size and rotation variation between the outputs for an image and its augmented view. An edge loss and copy-paste augmentation further enhance the detector. The reported results are 62.61% on DOTA, 86.15% on HRSC, and 34.71% on FAIR1M, and the authors describe the method as lightweight and competitive, especially in densely packed scenes.
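As an illustration of the Gaussian overlap loss mentioned above, the sketch below maps each rotated box (cx, cy, w, h, theta) to a 2D Gaussian and sums a pairwise overlap score. The covariance convention R diag(w^2/4, h^2/4) R^T and the choice of the Bhattacharyya coefficient as the overlap measure are assumptions made for this example; the summary does not specify either, and all function names are hypothetical.

```python
import math


def box_to_gaussian(cx, cy, w, h, theta):
    """Map a rotated box to (mean, covariance).

    Assumed convention: covariance = R diag(w^2/4, h^2/4) R^T,
    stored as the tuple (sxx, sxy, syy) of the symmetric 2x2 matrix.
    """
    c, s = math.cos(theta), math.sin(theta)
    a, b = (w / 2) ** 2, (h / 2) ** 2
    sxx = a * c * c + b * s * s
    syy = a * s * s + b * c * c
    sxy = (a - b) * c * s
    return (cx, cy), (sxx, sxy, syy)


def bhattacharyya_coefficient(g1, g2):
    """Overlap in (0, 1] between two 2D Gaussians; 1 means identical."""
    (mx1, my1), (a1, b1, d1) = g1
    (mx2, my2), (a2, b2, d2) = g2
    # Averaged covariance and the three determinants.
    a, b, d = (a1 + a2) / 2, (b1 + b2) / 2, (d1 + d2) / 2
    det = a * d - b * b
    det1 = a1 * d1 - b1 * b1
    det2 = a2 * d2 - b2 * b2
    dx, dy = mx1 - mx2, my1 - my2
    # Mahalanobis term using the closed-form inverse of a 2x2 matrix.
    maha = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    dist = maha / 8 + 0.5 * math.log(det / math.sqrt(det1 * det2))
    return math.exp(-dist)


def gaussian_overlap_loss(boxes):
    """Sum of pairwise overlaps; smaller means better-separated boxes."""
    gs = [box_to_gaussian(*box) for box in boxes]
    total = 0.0
    for i in range(len(gs)):
        for j in range(i + 1, len(gs)):
            total += bhattacharyya_coefficient(gs[i], gs[j])
    return total
```

Under these assumptions, two axis-aligned 10x10 boxes whose centers are 10 pixels apart score exp(-0.5), about 0.61, while shrinking both to 4x4 lowers the score to about 0.04. This suggests how penalizing overlap can discourage predicted boxes from growing into their neighbors.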

๐Ÿ“ Abstract
With the rapidly increasing demand for oriented object detection (OOD), recent research involving weakly-supervised detectors for learning OOD from point annotations has gained great attention. In this paper, we rethink this challenging task setting with the layout among instances and present Point2RBox-v2. At the core are three principles: 1) Gaussian overlap loss. It learns an upper bound for each instance by treating objects as 2D Gaussian distributions and minimizing their overlap. 2) Voronoi watershed loss. It learns a lower bound for each instance through watershed on Voronoi tessellation. 3) Consistency loss. It learns the size/rotation variation between two output sets with respect to an input image and its augmented view. Supplemented by a few devised techniques, e.g. edge loss and copy-paste, the detector is further enhanced. To the best of our knowledge, Point2RBox-v2 is the first approach to explore the spatial layout among instances for learning point-supervised OOD. Our solution is elegant and lightweight, yet it is expected to give competitive performance, especially in densely packed scenes: 62.61%/86.15%/34.71% on DOTA/HRSC/FAIR1M. Code is available at https://github.com/VisionXLab/point2rbox-v2.
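The Voronoi tessellation behind the second principle can be illustrated with a discrete version: every pixel is labeled with the index of its nearest annotated point, so each cell contains exactly one annotated point. This is a minimal sketch of the partition step only. The watershed step and the loss itself are omitted, and the function names and the brute-force nearest-point search are assumptions of this example rather than the paper's implementation.

```python
def voronoi_labels(points, width, height):
    """Label each pixel (x, y) with the index of its nearest point.

    points: list of (px, py) point annotations, one per instance.
    Returns a height x width grid (list of rows) of point indices.
    """
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            # Squared Euclidean distance to every annotated point.
            dists = [(x - px) ** 2 + (y - py) ** 2 for px, py in points]
            row.append(dists.index(min(dists)))
        labels.append(row)
    return labels


def cell_extent(labels, idx):
    """Axis-aligned extent (xmin, ymin, xmax, ymax) of one Voronoi cell."""
    xs = [x for row in labels for x, lab in enumerate(row) if lab == idx]
    ys = [y for y, row in enumerate(labels) for lab in row if lab == idx]
    return min(xs), min(ys), max(xs), max(ys)
```

For example, with points (2, 2) and (7, 2) on a 10x5 grid, the boundary falls between columns 4 and 5, so the first cell spans columns 0 to 4 and the second spans columns 5 to 9. A region-growing step such as watershed could then be run inside each cell without crossing into a neighboring instance.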
Problem

Research questions and friction points this paper is trying to address.

Improving oriented object detection with point annotations.
Exploring spatial layout among instances for detection.
Enhancing detector performance in densely packed scenes.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian overlap loss
Voronoi watershed loss
Consistency loss
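The consistency idea listed above can be sketched as follows: if the augmented view is the input rotated by a known angle and scaled by a known factor, the predicted angle and size for each instance should change by the same amounts. The L1 penalties, the log-ratio size term, the angle period of pi, the sign convention for rotation, and the function names are all assumptions of this illustration; the abstract only states that size and rotation variation between the two output sets is learned.

```python
import math


def angle_diff(a, b, period=math.pi):
    """Smallest signed difference a - b for angles repeating every `period`."""
    return (a - b + period / 2) % period - period / 2


def consistency_loss(preds, preds_aug, rot=0.0, scale=1.0):
    """Mean disagreement between two views of the same instances.

    preds, preds_aug: lists of (w, h, theta) in matching order, from the
    original image and from its augmented view respectively.
    rot, scale: the known rotation (radians) and scale of the augmentation.
    """
    total = 0.0
    for (w, h, t), (wa, ha, ta) in zip(preds, preds_aug):
        # Sizes should scale with the augmentation (compared in log space).
        size_term = abs(math.log(wa / (w * scale))) + abs(math.log(ha / (h * scale)))
        # Angles should shift by the augmentation's rotation, modulo pi.
        angle_term = abs(angle_diff(ta, t + rot))
        total += size_term + angle_term
    return total / max(len(preds), 1)
```

With a prediction (10, 4, 0.2) in the original view, an augmented view rotated by 0.5 rad and scaled by 2 gives zero loss when the second prediction is (20, 8, 0.7), and a positive loss when the second prediction ignores the transformation.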
Authors

Yi Yu
Southeast University

Botao Ren
Tsinghua University
Computer Vision, Object Detection, Remote Sensing

Peiyuan Zhang
Wuhan University

Mingxin Liu
Shanghai Jiao Tong University

Junwei Luo
Wuhan University
Vision-Language Model, Oriented Object Detection, Remote Sensing

Shaofeng Zhang
Shanghai Jiao Tong University

Feipeng Da
Southeast University

Junchi Yan
FIAPR & ICML Board Member, SJTU (2018-), SII (2024-), AWS (2019-2022), IBM (2011-2018)
Computational Intelligence, AI4Science, Machine Learning, Autonomous Driving

Xue Yang
Shanghai Jiao Tong University