S$^2$Teacher: Step-by-step Teacher for Sparsely Annotated Oriented Object Detection

📅 2025-04-15
📈 Citations: 0 (Influential: 0)
🤖 AI Summary
Problem: Fully supervised oriented object detection in remote sensing imagery incurs prohibitively high annotation costs, and existing weakly and semi-supervised methods overlook the difficulty of densely annotating complex remote sensing scenes. Method: This paper introduces sparsely annotated oriented object detection (SAOOD), a learning setting in which only a subset of the oriented instances is labeled, and targets two challenges that arise under such sparse labeling: overfitting to the limited foreground representations and interference from unlabeled objects (false negatives). To tackle these, the authors propose S$^2$Teacher, a step-by-step teacher model that integrates curriculum-driven progressive pseudo-label generation, consistency regularization, dynamic pseudo-label mining, and instance-level reweighting of losses for unlabeled objects. Contribution/Results: On the DOTA benchmark, the method achieves near-fully-supervised performance with only 10% of instances annotated, substantially outperforming state-of-the-art weakly and semi-supervised approaches, and offers an effective trade-off between annotation efficiency and detection accuracy.

📝 Abstract
Although fully-supervised oriented object detection has made significant progress in multimodal remote sensing image understanding, it comes at the cost of labor-intensive annotation. Recent studies have explored weakly and semi-supervised learning to alleviate this burden. However, these methods overlook the difficulties posed by dense annotations in complex remote sensing scenes. In this paper, we introduce a novel setting called sparsely annotated oriented object detection (SAOOD), in which only a subset of instances is labeled, and propose a solution to address its challenges. Specifically, we focus on two key issues in the setting: (1) sparse labeling leading to overfitting on limited foreground representations, and (2) unlabeled objects (false negatives) confusing feature learning. To this end, we propose S$^2$Teacher, a novel method that progressively mines pseudo-labels for unlabeled objects, from easy to hard, to enhance foreground representations. Additionally, it reweights the loss of unlabeled objects to mitigate their impact during training. Extensive experiments demonstrate that S$^2$Teacher not only significantly improves detector performance across different sparse annotation levels but also achieves near-fully-supervised performance on the DOTA dataset with only 10% of instances annotated, effectively balancing detection accuracy with annotation efficiency. The code will be made public.
Problem

Research questions and friction points this paper is trying to address.

Tackles sparsely annotated oriented object detection in remote sensing imagery
Addresses overfitting to the few labeled foreground instances and confusion from unlabeled objects (false negatives)
Seeks high detection accuracy with far less annotation effort
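To make the sparse-annotation setting concrete, the sketch below simulates it by keeping a random fraction of an image's oriented boxes and treating the rest as unlabeled. The function name `sparsify_annotations`, the `(cx, cy, w, h, angle)` box layout, and uniform random sampling are all assumptions made for illustration; the summary above does not state how the paper selects which instances stay labeled.

```python
import random


def sparsify_annotations(instances, keep_ratio=0.1, seed=0):
    """Split one image's boxes into a labeled subset and an unlabeled remainder.

    `instances` is a list of oriented boxes, e.g. (cx, cy, w, h, angle) tuples.
    Uniform random sampling is an assumption, not the paper's stated protocol.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if not instances:
        return [], []
    rng = random.Random(seed)
    # Keep at least one box so the image still carries supervision (assumption).
    n_keep = max(1, round(len(instances) * keep_ratio))
    kept = set(rng.sample(range(len(instances)), n_keep))
    labeled = [box for i, box in enumerate(instances) if i in kept]
    unlabeled = [box for i, box in enumerate(instances) if i not in kept]
    return labeled, unlabeled
```

With `keep_ratio=0.1`, an image with 20 objects keeps 2 labeled boxes; the other 18 are exactly the false negatives that a detector trained naively would treat as background.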
Innovation

Methods, ideas, or system contributions that make the work stand out.

Progressively mines pseudo-labels for unlabeled objects, from easy to hard
Reweights the loss of unlabeled objects to mitigate their impact during training
Enhances foreground representations with the mined pseudo-labels
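The two ideas listed above can be illustrated with a minimal sketch. It is not the paper's implementation: using the teacher's confidence score as the measure of "easy" versus "hard", the linear threshold schedule, and the `1 - score` weighting are all assumptions chosen for illustration, as are the function names and default values.

```python
def confidence_threshold(step, total_steps, start=0.9, end=0.5):
    """Relax the score threshold linearly from `start` to `end` (assumed schedule)."""
    progress = min(max(step / max(total_steps, 1), 0.0), 1.0)
    return start + (end - start) * progress


def mine_pseudo_labels(teacher_predictions, step, total_steps):
    """Keep teacher predictions whose score clears the current threshold.

    Early steps admit only high-score ("easy") predictions; later steps
    also admit lower-score ("hard") ones.
    """
    threshold = confidence_threshold(step, total_steps)
    return [p for p in teacher_predictions if p["score"] >= threshold]


def reweighted_negative_loss(negative_losses, teacher_scores, floor=0.0):
    """Average background losses, down-weighting likely unlabeled objects.

    A location the teacher scores highly is probably a missed object, so its
    background loss gets weight (1 - score), clipped below at `floor`.
    """
    weights = [max(1.0 - s, floor) for s in teacher_scores]
    total = sum(w * l for w, l in zip(weights, negative_losses))
    return total / max(len(negative_losses), 1)
```

For example, with predictions scored 0.95, 0.75, and 0.55, only the first is accepted at step 0, while all three are accepted at the final step. The reweighting term reduces the penalty for predicting an object at a location where the teacher is confident one exists but no label was provided.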
Yu Lin
School of Informatics, Xiamen University, Xiamen, China
Jianghang Lin
Xiamen University
Multimodal Large Language Model, Vision-Language Model, Semi/Weakly-Supervised Learning
Kai Ye
School of Informatics, Xiamen University, Xiamen, China
You Shen
Xiamen University
3DV
Yan Zhang
School of Informatics, Xiamen University, Xiamen, China
Shengchuan Zhang
Xiamen University
computer vision, machine learning
Liujuan Cao
School of Informatics, Xiamen University, Xiamen, China
Rongrong Ji
School of Informatics, Xiamen University, Xiamen, China