🤖 AI Summary
Existing cross-domain instance mask annotation methods suffer from low efficiency, poor generalization, and reliance on target-domain fine-tuning. Method: This paper proposes SiamAnno, a one-shot instance annotation framework built on a Siamese network and, to the authors' knowledge, the first to apply a Siamese architecture to instance annotation. It generalizes across domains without target-domain fine-tuning. The framework combines boundary prediction with human-in-the-loop interaction: the user draws a bounding box, the model predicts the object's contour, and the annotator can then adjust the result. Results: Trained on one dataset and tested on another without fine-tuning, SiamAnno achieves state-of-the-art performance across multiple datasets, indicating that it handles domain and environment shifts in cross-domain annotation.
📝 Abstract
Annotating instance masks is time-consuming and labor-intensive. A promising solution is to predict contours using a deep learning model and then allow users to refine them. However, most existing methods focus on in-domain scenarios, limiting their effectiveness for cross-domain annotation tasks. In this paper, we propose SiamAnno, a framework inspired by the use of Siamese networks in object tracking. SiamAnno leverages one-shot learning to annotate previously unseen objects: it takes a bounding box as input and predicts the object's boundary, which annotators can then adjust. Trained on one dataset and tested on another without fine-tuning, SiamAnno achieves state-of-the-art (SOTA) performance across multiple datasets, demonstrating its ability to handle domain and environment shifts in cross-domain tasks. We also provide more comprehensive results than previous work, establishing a strong baseline for future research. To our knowledge, SiamAnno is the first model to explore a Siamese architecture for instance annotation.