Out-of-Distribution Object Detection in Street Scenes via Synthetic Outlier Exposure and Transfer Learning

📅 2026-03-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the problem that out-of-distribution (OOD) objects in street scenes are often missed entirely by standard object detectors and treated as background. The authors propose SynOE-OD, a framework that combines synthetic outlier exposure with transfer learning so that a single detector handles both in-distribution (ID) and OOD objects in a unified way. The method uses generative models such as Stable Diffusion together with open-vocabulary object detectors (OVODs) to generate semantically meaningful, object-level outlier data for training, rather than relying on complex architectures or auxiliary branches. On an established OOD object detection benchmark for street scenes, SynOE-OD achieves state-of-the-art average precision, whereas OVODs such as GroundingDINO show limited zero-shot performance.

📝 Abstract
Out-of-distribution (OOD) object detection is an important yet underexplored task. A reliable object detector should be able to handle OOD objects by localizing them and correctly classifying them as OOD. However, a critical issue arises when such atypical objects are completely missed by the object detector and incorrectly treated as background. Existing OOD detection approaches in object detection often rely on complex architectures or auxiliary branches and typically do not provide a framework that treats in-distribution (ID) and OOD objects in a unified way. In this work, we address these limitations by enabling a single detector to detect OOD objects that are otherwise silently overlooked, alongside ID objects. We present SynOE-OD, a Synthetic Outlier-Exposure-based Object Detection framework that leverages strong generative models, such as Stable Diffusion, and Open-Vocabulary Object Detectors (OVODs) to generate semantically meaningful, object-level data that serves as outliers during training. The generated data is used for transfer learning to establish strong ID task performance and to supplement detection models with OOD object detection robustness. Our approach achieves state-of-the-art average precision on an established OOD object detection benchmark, where OVODs such as GroundingDINO show limited zero-shot performance in detecting OOD objects in street scenes.
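To make the idea of a unified ID/OOD label space concrete, the following is a minimal sketch of how synthetic outlier boxes might be merged into a detector's training targets under one extra OOD class. It is not the authors' implementation: the `Box` type, the class list, the `build_targets` helper, and both thresholds are hypothetical choices made only for illustration, and the abstract does not describe how the generated data is filtered or labeled.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical ID label set; the real benchmark classes are not given here.
ID_CLASSES = ["car", "person", "bicycle"]
# Assumption: all outliers share one extra class index after the ID classes.
OOD_LABEL = len(ID_CLASSES)


@dataclass(frozen=True)
class Box:
    x1: float
    y1: float
    x2: float
    y2: float
    label: int
    score: float = 1.0


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def build_targets(
    id_boxes: List[Box],
    outlier_boxes: List[Box],
    min_score: float = 0.5,  # assumed confidence cutoff for pseudo-boxes
    max_iou: float = 0.5,    # assumed overlap limit with ID ground truth
) -> List[Box]:
    """Merge ID ground truth with synthetic outlier boxes relabeled as OOD."""
    targets = list(id_boxes)
    for box in outlier_boxes:
        if box.score < min_score:
            continue  # drop low-confidence outlier boxes
        if any(iou(box, gt) > max_iou for gt in id_boxes):
            continue  # do not relabel a region already covered by an ID object
        targets.append(
            Box(box.x1, box.y1, box.x2, box.y2, OOD_LABEL, box.score)
        )
    return targets
```

With targets assembled this way, a single detection head could be trained on `len(ID_CLASSES) + 1` classes, so OOD objects are localized and classified by the same model as ID objects instead of by an auxiliary branch.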
Problem

Research questions and friction points this paper is trying to address.

Out-of-Distribution Object Detection
Street Scenes
Object Detection
OOD Detection
Unknown Objects
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthetic Outlier Exposure
Out-of-Distribution Detection
Transfer Learning
Open-Vocabulary Object Detection
Generative Models
Authors

Sadia Ilyas
University of Wuppertal; Aptiv Services Deutschland GmbH
Annika Mütze
University of Wuppertal
Klaus Friedrichs
Aptiv Services Deutschland GmbH
Thomas Kurbiel
Aptiv Services Deutschland GmbH
Matthias Rottmann
Professor of Computer Science, Osnabrück University, Germany
Computer Vision, Deep Learning, Safe AI, Efficient AI, Numerical Linear Algebra