Wholly-WOOD: Wholly Leveraging Diversified-quality Labels for Weakly-supervised Oriented Object Detection

📅 2025-02-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Rotated object detection suffers from high annotation costs because it requires precise, angle-labeled rotated bounding boxes. Method: The paper proposes a multi-granularity weakly-supervised learning framework that trains orientation-aware detectors from arbitrary combinations of point, axis-aligned bounding box (AABB), and rotated bounding box (RBB) annotations. It introduces an end-to-end architecture integrating annotation-form fusion, pseudo-label generation, cross-annotation knowledge distillation, and rotation-aware feature alignment, without requiring additional angular supervision. Contribution/Results: Extensive experiments on remote sensing and other domains show that the method approaches fully supervised performance (92.3% mAP@50) using only AABB annotations, drastically reducing annotation overhead. The framework is open-sourced.

📝 Abstract
Accurately estimating the orientation of visual objects with compact rotated bounding boxes (RBoxes) has become a prominent demand, which challenges existing object detection paradigms that only use horizontal bounding boxes (HBoxes). To equip the detectors with orientation awareness, supervised regression/classification modules have been introduced at the high cost of rotation annotation. Meanwhile, some existing datasets with oriented objects are already annotated with horizontal boxes or even single points. It becomes attractive yet remains open for effectively utilizing weaker single point and horizontal annotations to train an oriented object detector (OOD). We develop Wholly-WOOD, a weakly-supervised OOD framework, capable of wholly leveraging various labeling forms (Points, HBoxes, RBoxes, and their combination) in a unified fashion. By only using HBox for training, our Wholly-WOOD achieves performance very close to that of the RBox-trained counterpart on remote sensing and other areas, significantly reducing the tedious efforts on labor-intensive annotation for oriented objects. The source codes are available at https://github.com/VisionXLab/whollywood (PyTorch-based) and https://github.com/VisionXLab/whollywood-jittor (Jittor-based).
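To make the relationship between the three label forms concrete, here is a small illustrative sketch (not the paper's actual pipeline; the function names are hypothetical) of the geometry Wholly-WOOD exploits: both a single-point label and an HBox can be derived from an RBox, but not the reverse, which is why recovering orientation from the weaker forms requires learning.

```python
import math

def rbox_to_hbox(cx, cy, w, h, angle):
    """Axis-aligned box (HBox) enclosing a rotated box (RBox).

    The RBox is given by center (cx, cy), size (w, h), and rotation
    `angle` in radians. Many RBoxes map to the same HBox, so the
    inverse mapping (the weakly-supervised setting) is ambiguous.
    """
    cos_a, sin_a = abs(math.cos(angle)), abs(math.sin(angle))
    bw = w * cos_a + h * sin_a  # extent of the rotated box along x
    bh = w * sin_a + h * cos_a  # extent of the rotated box along y
    return (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2)

def rbox_to_point(cx, cy, w, h, angle):
    """Single-point label: just the object center, discarding size and angle."""
    return (cx, cy)
```

For example, an unrotated 4x2 box at the origin yields the HBox `(-2, -1, 2, 1)`, while the same box rotated by 90 degrees yields `(-1, -2, 1, 2)`: the HBox changes with orientation even though the object does not.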
Problem

Research questions and friction points this paper is trying to address.

Training oriented object detectors without full rotated-box (RBox) supervision
Effectively leveraging labels of diversified quality (Points, HBoxes, RBoxes)
Reducing labor-intensive rotation annotation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Wholly-WOOD: a unified weakly-supervised oriented object detection framework
Wholly leverages Points, HBoxes, RBoxes, and their combinations in a unified fashion
HBox-only training approaches RBox-trained performance, reducing rotation annotation
Yi Yu
School of Automation, Southeast University, Nanjing, 210096, China
Xue Yang
Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China
Yansheng Li
Professor, Wuhan University
Deep Learning · Knowledge Graph · Remote Sensing Big Data Mining
Zhenjun Han
School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing, 100049, China
Feipeng Da
School of Automation, Southeast University, Nanjing, 210096, China
Junchi Yan
FIAPR & ICML Board Member, SJTU (2018-), SII (2024-), AWS (2019-2022), IBM (2011-2018)
Computational Intelligence · AI4Science · Machine Learning · Autonomous Driving