🤖 AI Summary
This paper addresses unsupervised domain adaptation (UDA) for DETR-based object detection in the source-free setting, i.e., without access to source-domain data. The authors propose FRANCK, a framework designed specifically for Detection Transformers. Its core contributions are: (1) objectness score-based sample reweighting; (2) contrastive learning guided by a matching-based memory bank; (3) uncertainty-weighted, query-fused feature distillation; and (4) a dynamic teacher-update schedule. FRANCK combines attention-based objectness scoring, multi-scale feature reweighting, and self-training with improved pseudo-labels. Evaluated on multiple benchmarks, including PASCAL VOC and Cityscapes, FRANCK achieves state-of-the-art source-free UDA performance for DETR, improving robustness and generalization when source data is unavailable while preserving DETR's end-to-end architecture and training paradigm.
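The objectness-aware reweighting in contribution (1) can be illustrated with a minimal sketch. This is an assumption about the general mechanism (upweighting the loss in regions the detector recognizes poorly); the scoring and weighting functions below are illustrative, not the paper's exact formulation.

```python
# Hedged sketch of objectness score-based sample reweighting (OSSR-style).
# The weighting rule (1 + (1 - objectness)) is a hypothetical choice that
# emphasizes less-recognized regions, as the summary describes.

def reweighted_loss(per_sample_losses, objectness_scores):
    """Average detection loss with samples upweighted by low objectness.

    objectness_scores are assumed to lie in [0, 1]; a score near 0 means the
    region is poorly recognized, so its loss weight approaches 2.0.
    """
    weights = [1.0 + (1.0 - s) for s in objectness_scores]
    total = sum(w * loss for w, loss in zip(weights, per_sample_losses))
    return total / len(per_sample_losses)
```

For example, two samples with equal raw loss but objectness 1.0 and 0.0 receive weights 1.0 and 2.0, so the poorly recognized sample dominates the average.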
📝 Abstract
Source-Free Object Detection (SFOD) enables knowledge transfer from a source domain to an unsupervised target domain for object detection without access to source data. Most existing SFOD approaches are either confined to conventional object detection (OD) models like Faster R-CNN or designed as general solutions without tailored adaptations for novel OD architectures, especially Detection Transformer (DETR). In this paper, we introduce Feature Reweighting ANd Contrastive Learning NetworK (FRANCK), a novel SFOD framework specifically designed to perform query-centric feature enhancement for DETRs. FRANCK comprises four key components: (1) an Objectness Score-based Sample Reweighting (OSSR) module that computes attention-based objectness scores on multi-scale encoder feature maps, reweighting the detection loss to emphasize less-recognized regions; (2) a Contrastive Learning with Matching-based Memory Bank (CMMB) module that integrates multi-level features into memory banks, enhancing class-wise contrastive learning; (3) an Uncertainty-weighted Query-fused Feature Distillation (UQFD) module that improves feature distillation through prediction quality reweighting and query feature fusion; and (4) an improved self-training pipeline with a Dynamic Teacher Updating Interval (DTUI) that optimizes pseudo-label quality. By leveraging these components, FRANCK effectively adapts a source-pre-trained DETR model to a target domain with enhanced robustness and generalization. Extensive experiments on several widely used benchmarks demonstrate that our method achieves state-of-the-art performance, highlighting its effectiveness and compatibility with DETR-based SFOD models.
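The self-training pipeline with a Dynamic Teacher Updating Interval (DTUI) can be sketched as a Mean-Teacher-style loop. The sketch below assumes the common exponential-moving-average (EMA) teacher update; the decay value and the interval heuristic are illustrative assumptions, not the paper's exact rule.

```python
# Hedged sketch of a teacher-student self-training update (assumption: an
# EMA teacher, as is standard in source-free self-training; the `momentum`
# value and `dynamic_interval` heuristic are hypothetical).

def ema_update(teacher, student, momentum=0.999):
    """Exponential-moving-average update of teacher weights from the student."""
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]

def dynamic_interval(base_interval, pseudo_label_quality, min_interval=1):
    """Shorten the teacher-update interval when pseudo-labels look reliable.

    pseudo_label_quality is assumed to be in (0, 1], e.g. the mean confidence
    of accepted pseudo-labels on the target domain.
    """
    scale = max(pseudo_label_quality, 1e-6)
    return max(min_interval, int(round(base_interval / scale)))
```

With `base_interval=4`, fully confident pseudo-labels keep the interval at 4 iterations, while a mean confidence of 0.5 stretches it to 8, slowing teacher drift when labels are noisy.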