🤖 AI Summary
Long-tailed performance degradation in object detection stems from data bias, and existing generative augmentation methods suffer from insufficient representation diversity and from low fidelity and controllability in layout-to-image synthesis. This paper proposes a scoring-driven generative debiasing framework. First, a Representation Score is introduced to quantify model representation gaps, guiding the construction of unbiased scene layouts. Then, visual blueprints—rather than textual prompts—are employed as layout conditioning, integrated with a generation alignment mechanism to achieve high-fidelity, controllable synthesis of complex scenes. The framework jointly optimizes the detector and the generative model through interactive learning. Experiments demonstrate consistent improvements: +4.4 mAP on large categories and +3.6 mAP on rare categories. Moreover, layout generation accuracy surpasses the state-of-the-art layout-to-image model by 15.9 mAP, significantly mitigating long-tail bias.
📝 Abstract
This paper presents a generation-based debiasing framework for object detection. Prior debiasing methods are often limited by the representation diversity of samples, while naive generative augmentation often preserves the very biases it aims to remove. Moreover, our analysis reveals that simply generating more data for rare classes is suboptimal due to two core issues: i) instance frequency is an incomplete proxy for the true data needs of a model, and ii) current layout-to-image synthesis lacks the fidelity and control needed to generate high-quality, complex scenes. To overcome these issues, we introduce the representation score (RS) to diagnose representational gaps beyond mere frequency, guiding the creation of new, unbiased layouts. To ensure high-quality synthesis, we replace ambiguous text prompts with a precise visual blueprint and employ a generative alignment strategy, which fosters communication between the detector and generator. Our method significantly narrows the performance gap for underrepresented object groups, e.g., improving large/rare instances by 4.4/3.6 mAP over the baseline, and surpassing prior L2I synthesis models by 15.9 mAP in layout accuracy of generated images.
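The abstract does not give the RS formula, but the core idea — scoring each class by more than raw instance frequency and allocating synthetic layouts toward low-scoring classes — can be illustrated with a toy sketch. Everything here is an assumption for illustration: the function names, the choice of combining normalized frequency with per-class detector AP, and the `alpha` weight are all hypothetical, not the paper's actual definition.

```python
import numpy as np

def representation_score(freq, ap, alpha=0.5):
    """Toy stand-in for the paper's RS (assumed form, not the real one).

    Combines normalized instance frequency with per-class detector AP,
    so a class can score low either because it is rare or because the
    detector represents it poorly despite adequate counts.
    """
    freq = np.asarray(freq, dtype=float)
    ap = np.asarray(ap, dtype=float)
    f_norm = freq / freq.sum()          # each class's share of instances
    return alpha * f_norm + (1 - alpha) * ap

def sampling_weights(scores, eps=1e-8):
    """Classes with low RS receive proportionally more synthetic layouts."""
    inv = 1.0 / (scores + eps)
    return inv / inv.sum()

# Hypothetical head / mid / tail classes.
counts = [1000, 200, 10]   # instance counts per class
aps = [0.60, 0.45, 0.20]   # per-class AP from the current detector

rs = representation_score(counts, aps)
w = sampling_weights(rs)
# The tail class (lowest count AND lowest AP) gets the largest
# share of generated layouts.
```

The point of the sketch is the diagnosis-then-allocation loop: under this scheme a frequent-but-poorly-detected class can still attract synthetic data, which a pure frequency heuristic would miss.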