🤖 AI Summary
This study addresses the significant performance drop in cross-dataset object detection when transferring models from domain-specific settings—such as driving or aerial imagery—to general-purpose scenarios. It presents the first systematic analysis of such transfers through the lens of “setting specificity,” distinguishing between setting-dependent and setting-agnostic datasets. To disentangle the effects of domain shift and label mismatch, the authors introduce an open-label evaluation protocol that maps predicted categories to target labels via CLIP-based semantic similarity. By integrating both closed- and open-label evaluations, they quantify the contributions of domain-related and label-related errors. Experiments reveal that transfer within similar settings remains stable, whereas cross-setting transfer incurs substantial degradation. The open-label protocol consistently yields modest performance gains, primarily by correcting semantically plausible misclassifications supported by visual evidence.
📝 Abstract
Object detectors often perform well in-distribution, yet degrade sharply on a different benchmark. We study cross-dataset object detection (CD-OD) through a lens of setting specificity. We group benchmarks into setting-agnostic datasets with diverse everyday scenes and setting-specific datasets tied to a narrow environment, and evaluate a standard detector family across all train--test pairs. This reveals a clear structure in CD-OD: transfer within the same setting type is relatively stable, while transfer across setting types drops substantially and is often asymmetric. The most severe breakdowns occur when transferring from specific sources to agnostic targets, and persist after open-label alignment, indicating that domain shift dominates in the hardest regimes. To disentangle domain shift from label mismatch, we compare closed-label transfer with an open-label protocol that maps predicted classes to the nearest target label using CLIP similarity. Open-label evaluation yields consistent but bounded gains, and many corrected cases correspond to semantic near-misses supported by the image evidence. Overall, we provide a principled characterization of CD-OD under setting specificity and practical guidance for evaluating detectors under distribution shift. Code will be released at \href{https://github.com/Ritabrata04/cdod-icpr.git}{https://github.com/Ritabrata04/cdod-icpr}.
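The open-label step described above, mapping each predicted class to its nearest target label by similarity, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes label embeddings have already been computed (for example with a CLIP text encoder) and uses hand-made toy vectors in their place. The function names `cosine` and `map_to_target_labels` and the example label sets are hypothetical. The abstract does not specify the CLIP variant, any prompt template, or whether a similarity threshold is applied.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def map_to_target_labels(source_emb, target_emb):
    """Map every source label to the most similar target label.

    source_emb, target_emb: dicts of label name -> embedding vector.
    Returns a dict of source label -> (target label, similarity).
    """
    mapping = {}
    for src_label, src_vec in source_emb.items():
        best_label = max(
            target_emb, key=lambda t: cosine(src_vec, target_emb[t])
        )
        mapping[src_label] = (best_label, cosine(src_vec, target_emb[best_label]))
    return mapping


# Toy stand-ins for text embeddings (assumption: real ones would come
# from a CLIP text encoder and have far more dimensions).
source = {"sedan": [0.9, 0.1, 0.0], "cyclist": [0.1, 0.9, 0.2]}
target = {
    "car": [1.0, 0.0, 0.0],
    "person": [0.0, 1.0, 0.0],
    "dog": [0.0, 0.0, 1.0],
}

mapping = map_to_target_labels(source, target)
for src_label, (tgt_label, sim) in mapping.items():
    print(f"{src_label} -> {tgt_label} (cosine {sim:.2f})")
```

With such a mapping in hand, the detector's predictions can be relabeled into the target vocabulary before scoring, so that remaining errors are more plausibly attributable to domain shift than to label mismatch.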