Towards Unbiased Source-Free Object Detection via Vision Foundation Models

📅 2026-01-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited generalization and error accumulation during self-training commonly observed in source-free object detection (SFOD), where adapted models rely too heavily on source-domain knowledge. To mitigate this source-domain bias, the authors propose the Debiased Source-free Object Detection (DSOD) framework, which, for the first time in this context, leverages a vision foundation model (VFM). DSOD introduces a Unified Feature Injection (UFI) module and a Semantic-Aware Feature Regularization (SAFR) mechanism to enhance cross-domain representation consistency. Furthermore, a dual-teacher distillation scheme enables efficient deployment without requiring the VFM at inference time. Extensive experiments demonstrate state-of-the-art performance across multiple cross-domain benchmarks: 48.1% AP on Normal-to-Foggy, 39.3% AP on Cross-scene, and 61.4% AP on Synthetic-to-Real settings.

📝 Abstract
Source-Free Object Detection (SFOD) has garnered much attention in recent years by eliminating the need for source-domain data in cross-domain tasks, but existing SFOD methods suffer from the Source Bias problem, i.e., the adapted model remains skewed towards the source domain, leading to poor generalization and error accumulation during self-training. To overcome this challenge, we propose Debiased Source-free Object Detection (DSOD), a novel VFM-assisted SFOD framework that effectively mitigates source bias with the help of powerful VFMs. Specifically, we propose a Unified Feature Injection (UFI) module that integrates VFM features into the CNN backbone through Simple-Scale Extension (SSE) and Domain-aware Adaptive Weighting (DAAW). We then propose Semantic-aware Feature Regularization (SAFR), which constrains feature learning to prevent overfitting to source-domain characteristics. Furthermore, for computation-restricted scenarios, we propose a VFM-free variant, termed DSOD-distill, obtained through a novel Dual-Teacher distillation scheme. Extensive experiments on multiple benchmarks demonstrate that DSOD outperforms state-of-the-art SFOD methods, achieving 48.1% AP on Normal-to-Foggy weather adaptation, 39.3% AP on Cross-scene adaptation, and 61.4% AP on Synthetic-to-Real adaptation.
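The feature-injection idea in the abstract (rescale VFM features to the backbone's resolution, then blend them in with a domain-aware weight) can be sketched roughly as follows. This is a minimal toy illustration, not the paper's implementation: the resizing rule, the scalar blending weight `alpha`, and all function names are assumptions made for exposition.

```python
def nearest_resize(feat, out_h, out_w):
    """Nearest-neighbor resize of a 2D feature map (list of lists)."""
    in_h, in_w = len(feat), len(feat[0])
    return [
        [feat[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]

def inject(cnn_feat, vfm_feat, alpha):
    """Blend a resized VFM feature map into a CNN backbone feature map.

    alpha in [0, 1] stands in for a domain-aware adaptive weight:
    larger alpha trusts the (more domain-robust) VFM features more.
    """
    h, w = len(cnn_feat), len(cnn_feat[0])
    vfm_resized = nearest_resize(vfm_feat, h, w)
    return [
        [(1 - alpha) * cnn_feat[i][j] + alpha * vfm_resized[i][j]
         for j in range(w)]
        for i in range(h)
    ]

# Toy example: a 2x2 backbone map fused with a 1x1 VFM map.
cnn = [[1.0, 2.0], [3.0, 4.0]]
vfm = [[10.0]]
fused = inject(cnn, vfm, alpha=0.5)  # each cell: 0.5*cnn + 0.5*10.0
```

In the real framework the blending weight would be learned per domain (the DAAW component) and the resizing would span multiple backbone scales (the SSE component); the fixed scalar here is purely for illustration.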
Problem

Research questions and friction points this paper is trying to address.

Source-Free Object Detection
Source Bias
Domain Adaptation
Object Detection
Cross-domain Generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Source-Free Object Detection
Vision Foundation Models
Domain Adaptation
Feature Injection
Knowledge Distillation