Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances

📅 2025-02-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the lack of instance-level spatial layout modeling in point-supervised oriented object detection (OOD). It proposes Point2RBox-v2, which the authors describe as the first point-supervised OOD approach to exploit the spatial layout among instances. The method combines Gaussian distribution modeling, Voronoi tessellation, and watershed analysis into three losses: (i) a Gaussian overlap loss, which treats objects as 2D Gaussian distributions and minimizes their overlap to learn an upper bound for each instance; (ii) a Voronoi watershed loss, which applies watershed on a Voronoi tessellation to learn a lower bound for each instance; and (iii) a consistency loss, which constrains the size and rotation variation between the outputs for an image and for its augmented view. An edge loss and copy-paste augmentation further enhance the detector. The reported results are 62.61% on DOTA, 86.15% on HRSC, and 34.71% on FAIR1M, and the authors highlight densely packed scenes as the setting where the method is expected to be most competitive. They also describe the approach as lightweight.

📝 Abstract

With the rapidly increasing demand for oriented object detection (OOD), recent research on weakly-supervised detectors that learn OOD from point annotations has attracted great attention. In this paper, we rethink this challenging task setting with the layout among instances and present Point2RBox-v2. At the core are three principles: 1) Gaussian overlap loss. It learns an upper bound for each instance by treating objects as 2D Gaussian distributions and minimizing their overlap. 2) Voronoi watershed loss. It learns a lower bound for each instance through watershed on Voronoi tessellation. 3) Consistency loss. It learns the size/rotation variation between two output sets with respect to an input image and its augmented view. Supplemented by a few devised techniques, e.g. edge loss and copy-paste, the detector is further enhanced. To the best of our knowledge, Point2RBox-v2 is the first approach to explore the spatial layout among instances for learning point-supervised OOD. Our solution is elegant and lightweight, yet it is expected to give competitive performance, especially in densely packed scenes: 62.61%/86.15%/34.71% on DOTA/HRSC/FAIR1M. Code is available at https://github.com/VisionXLab/point2rbox-v2.
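
To make the Gaussian overlap idea in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each rotated box (cx, cy, w, h, theta) maps to a 2D Gaussian with covariance R diag((w/2)^2, (h/2)^2) R^T, and it measures overlap with the Bhattacharyya coefficient, one standard similarity between two Gaussians. The abstract does not specify either choice, and the function names are hypothetical.

```python
import numpy as np

def rbox_to_gaussian(cx, cy, w, h, theta):
    """Map a rotated box to (mean, covariance).

    The (w/2, h/2) scaling is an assumed convention, not taken from the paper.
    """
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    scale = np.diag([(w / 2.0) ** 2, (h / 2.0) ** 2])
    return np.array([cx, cy], dtype=float), rot @ scale @ rot.T

def bhattacharyya_coefficient(mu1, cov1, mu2, cov2):
    """Overlap in (0, 1] between two Gaussians; 1 means identical."""
    cov = 0.5 * (cov1 + cov2)
    diff = mu1 - mu2
    # Mahalanobis-like term of the Bhattacharyya distance.
    dist = 0.125 * diff @ np.linalg.solve(cov, diff)
    # Covariance-mismatch term of the Bhattacharyya distance.
    dist += 0.5 * np.log(
        np.linalg.det(cov) / np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2))
    )
    return float(np.exp(-dist))

def gaussian_overlap_loss(boxes):
    """Sum of pairwise overlaps; minimizing it pushes predicted boxes apart or smaller."""
    gaussians = [rbox_to_gaussian(*b) for b in boxes]
    total = 0.0
    for i in range(len(gaussians)):
        for j in range(i + 1, len(gaussians)):
            total += bhattacharyya_coefficient(*gaussians[i], *gaussians[j])
    return total
```

Under these assumptions, enlarging two neighboring boxes raises the loss, which is one way to read the "upper bound" role described in the abstract: predicted boxes cannot grow freely without overlapping their neighbors.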

Problem

Research questions and friction points this paper is trying to address.

Improving oriented object detection with point annotations.

Exploring spatial layout among instances for detection.

Enhancing detector performance in densely packed scenes.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian overlap loss

Voronoi watershed loss

Consistency loss
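
As a rough illustration of the tessellation behind the Voronoi watershed loss listed above, the sketch below labels every pixel with the index of its nearest annotated point, which yields a discrete Voronoi partition of the image. It covers only this first step under simplifying assumptions (Euclidean distance, points given as (x, y) pixel coordinates); the watershed stage and the loss itself are omitted, and the function name is hypothetical rather than taken from the released code.

```python
import numpy as np

def voronoi_labels(points, height, width):
    """Return an (H, W) integer map: index of the nearest point for each pixel."""
    ys, xs = np.mgrid[0:height, 0:width]
    pixels = np.stack([xs, ys], axis=-1).astype(float)  # (H, W, 2) as (x, y)
    pts = np.asarray(points, dtype=float)               # (N, 2) as (x, y)
    # Squared distance from every pixel to every annotated point: (H, W, N).
    dist2 = ((pixels[:, :, None, :] - pts[None, None, :, :]) ** 2).sum(axis=-1)
    return dist2.argmin(axis=-1)

# Two annotated points on an 8-pixel-wide strip split it into two cells.
labels = voronoi_labels([(1, 1), (6, 1)], height=3, width=8)
```

Each cell contains exactly one annotated point, so any region grown from that point within its cell cannot spill into a neighboring instance. This is one plausible reading of how the tessellation supports a per-instance bound in crowded scenes.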

🔎 Similar Papers

Oriented Object Detection in Optical Remote Sensing Images using Deep Learning: A Survey