Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation

📅 2024-09-16

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

DETR-based scene graph generation (SGG) suffers from two problems. The first is sparse supervision: each image typically has fewer than 10 relation annotations, while the model uses more than 100 relation queries. The second is false-negative assignment: queries that match a ground-truth relation almost as well as the selected query are still treated as negatives. To address these challenges, we propose a hybrid relation assignment mechanism that jointly optimizes one-to-one matching and IoU-based one-to-many assignment within the DETR framework, the first such integration. We further design Hydra Branch, a lightweight auxiliary decoder without self-attention, which makes one-to-many assignment more robust by encouraging different queries to repeat the same relation prediction, and introduce an IoU-aware matching strategy with multi-query collaborative prediction. Our method achieves state-of-the-art performance across three benchmarks: VG150 (mR@50 = 16.0), Open Images V6 (weighted score = 50.1), and GQA (mR@50 = 12.7), mitigating both the sparse-supervision and false-negative issues in DETR-based SGG.

📝 Abstract

DETR introduces a simplified one-stage framework for scene graph generation (SGG) but faces challenges of sparse supervision and false negative samples. The former occurs because each image typically contains fewer than 10 relation annotations, while DETR-based SGG models employ over 100 relation queries. Each ground truth relation is assigned to only one query during training. The latter arises when one ground truth relation may have multiple queries with similar matching scores, leading to suboptimally matched queries being treated as negative samples. To address these, we propose Hydra-SGG, a one-stage SGG method featuring a Hybrid Relation Assignment. This approach combines a One-to-One Relation Assignment with an IoU-based One-to-Many Relation Assignment, increasing positive training samples and mitigating sparse supervision. In addition, we empirically demonstrate that removing self-attention between relation queries leads to duplicate predictions, which actually benefits the proposed One-to-Many Relation Assignment. With this insight, we introduce Hydra Branch, an auxiliary decoder without self-attention layers, to further enhance One-to-Many Relation Assignment by promoting different queries to make the same relation prediction. Hydra-SGG achieves state-of-the-art performance on multiple datasets, including VG150 (16.0 mR@50), Open Images V6 (50.1 weighted score), and GQA (12.7 mR@50).
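The contrast between the two assignment rules described above can be illustrated with a small Python sketch. It is not the authors' implementation. The box-only matching cost, the use of the weaker of subject and object IoU as a relation-level overlap, the 0.5 threshold, and all function names are assumptions made for illustration. The brute-force matcher stands in for a proper Hungarian solver and is only practical for tiny inputs.

```python
from itertools import permutations

def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def pair_iou(pred, gt):
    """Assumed relation-level overlap: the weaker of subject and object IoU."""
    return min(box_iou(pred["sub"], gt["sub"]), box_iou(pred["obj"], gt["obj"]))

def one_to_one(preds, gts):
    """Each ground truth gets exactly one query (minimum total cost).

    Cost is 1 - pair IoU (class terms omitted for brevity, an assumption).
    """
    best, best_cost = (), float("inf")
    for queries in permutations(range(len(preds)), len(gts)):
        cost = sum(1.0 - pair_iou(preds[q], gts[g]) for g, q in enumerate(queries))
        if cost < best_cost:
            best, best_cost = queries, cost
    return {q: g for g, q in enumerate(best)}

def one_to_many(preds, gts, thresh=0.5):
    """Every query that overlaps a ground truth well enough becomes a positive.

    Returns {query: (gt_index, iou)}; the IoU could serve as a loss weight.
    """
    out = {}
    if not gts:
        return out
    for q, pred in enumerate(preds):
        ious = [pair_iou(pred, gt) for gt in gts]
        g = max(range(len(gts)), key=ious.__getitem__)
        if ious[g] >= thresh:
            out[q] = (g, ious[g])
    return out

# One annotated relation, three relation queries (toy data).
gts = [{"sub": (0, 0, 10, 10), "obj": (20, 20, 30, 30)}]
preds = [
    {"sub": (0, 0, 10, 10), "obj": (20, 20, 30, 30)},    # exact match
    {"sub": (0, 0, 10, 8), "obj": (20, 20, 30, 30)},     # near match
    {"sub": (50, 50, 60, 60), "obj": (70, 70, 80, 80)},  # unrelated
]

print(one_to_one(preds, gts))   # only query 0 is positive
print(one_to_many(preds, gts))  # queries 0 and 1 are positive
```

In this toy case the near-match query 1 is a negative under one-to-one matching but a positive under the one-to-many rule, which is the false-negative situation the abstract describes.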

Problem

Research questions and friction points this paper is trying to address.

Address sparse supervision in SGG

Mitigate false negative samples

Enhance one-to-many relation assignment

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid Relation Assignment

One-to-Many Relation Assignment

Hydra Branch without self-attention
