Fractional Correspondence Framework in Detection Transformer

📅 2024-10-28

🏛️ ACM Multimedia

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

DETR’s strict one-to-one assignment via the Hungarian algorithm struggles with varying object densities and scales, leading to missed detections of small objects and duplicate predictions. To address this, the authors propose the Regularized Transport Plan (RTP), a differentiable, optimal-transport-based matching scheme that replaces hard matching with soft, fractional assignments computed via the Sinkhorn algorithm. RTP formulates matching as a regularized transport cost between predictions and ground truths, so the assignment step stays differentiable during training. Relaxing the rigid bipartite constraint is intended to make matching more robust to variations in object scale and density. RTP-DETR is reported to gain +1.7 mAP over DINO-DETR and +3.8 mAP over Deformable DETR.

📝 Abstract

The Detection Transformer (DETR), by incorporating the Hungarian algorithm, has significantly simplified the matching process in object detection tasks. This algorithm computes an optimal one-to-one matching of predicted bounding boxes to ground-truth annotations during training. While effective, this strict matching process does not inherently account for the varying densities and distributions of objects, leading to suboptimal correspondences such as failing to handle multiple detections of the same object or missing small objects. To address this, we propose the Regularized Transport Plan (RTP). RTP introduces a flexible matching strategy that captures the cost of aligning predictions with ground truths to find the most accurate correspondences between these sets. By utilizing the differentiable Sinkhorn algorithm, RTP allows for soft, fractional matching rather than strict one-to-one assignments. This approach enhances the model's capability to manage varying object densities and distributions effectively. Our extensive evaluations on the MS-COCO and VOC benchmarks demonstrate the effectiveness of our approach. RTP-DETR surpasses Deform-DETR and the recently introduced DINO-DETR, achieving absolute mAP gains of +3.8% and +1.7%, respectively.
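To make the idea of soft, fractional matching concrete, here is a minimal pure-Python sketch of Sinkhorn iterations on a toy cost matrix. It is an illustration under stated assumptions, not the paper's implementation: the function name `sinkhorn_plan`, the cost values, the uniform marginals, and the regularization strength `epsilon` are all hypothetical choices made for this example.

```python
import math

def sinkhorn_plan(cost, row_mass, col_mass, epsilon=0.5, n_iters=200):
    """Entropy-regularized transport plan between predictions (rows)
    and ground truths (columns), computed by Sinkhorn scaling."""
    n, m = len(cost), len(cost[0])
    # Gibbs kernel: a low matching cost becomes a high affinity.
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows so prediction i sends out row_mass[i] in total.
        for i in range(n):
            u[i] = row_mass[i] / sum(kernel[i][j] * v[j] for j in range(m))
        # Rescale columns so ground truth j receives col_mass[j] in total.
        for j in range(m):
            v[j] = col_mass[j] / sum(kernel[i][j] * u[i] for i in range(n))
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Hypothetical costs: 3 predictions (rows) vs. 2 ground-truth boxes (columns).
# Predictions 0 and 1 both sit close to ground truth 0.
cost = [
    [0.1, 0.9],
    [0.2, 0.8],
    [0.9, 0.1],
]
row_mass = [1 / 3, 1 / 3, 1 / 3]  # assumed: each prediction carries equal mass
col_mass = [0.5, 0.5]             # assumed: each ground truth receives equal mass

plan = sinkhorn_plan(cost, row_mass, col_mass)
for row in plan:
    print([round(p, 3) for p in row])
```

Every entry of the resulting plan is strictly positive, so each prediction is fractionally associated with each ground truth, with most of its mass going to the cheapest match. Smaller values of `epsilon` push the plan toward a hard assignment, while larger values spread the mass more evenly.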

Problem

Research questions and friction points this paper is trying to address.

Addresses suboptimal object detection due to strict one-to-one matching.

Proposes flexible matching to handle varying object densities and distributions.

Improves handling of small objects and of multiple detections of the same object.
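The limitation of strict one-to-one matching can be seen on a toy example. The sketch below finds the minimum-cost one-to-one matching by brute force, which reaches the same optimum as the Hungarian algorithm on small inputs. The function name `one_to_one_match` and the cost values are hypothetical, and real DETR training uses an efficient solver rather than enumeration.

```python
from itertools import permutations

def one_to_one_match(cost):
    """Brute-force minimum-cost one-to-one matching.

    cost[i][j] is the cost of matching prediction i to ground truth j.
    Assumes there are at least as many predictions as ground truths.
    Returns a dict mapping ground-truth index -> prediction index.
    """
    n_preds, n_gt = len(cost), len(cost[0])
    best_total, best_assign = float("inf"), None
    # Each candidate picks one distinct prediction per ground truth.
    for preds in permutations(range(n_preds), n_gt):
        total = sum(cost[p][g] for g, p in enumerate(preds))
        if total < best_total:
            best_total, best_assign = total, preds
    return {g: p for g, p in enumerate(best_assign)}

# Hypothetical costs: predictions 0 and 1 are both close to ground truth 0.
cost = [
    [0.1, 0.9],
    [0.2, 0.8],
    [0.9, 0.1],
]

matches = one_to_one_match(cost)
unmatched = sorted(set(range(len(cost))) - set(matches.values()))
print(matches)    # ground truth -> matched prediction
print(unmatched)  # predictions left without any ground truth
```

Here prediction 1 is nearly as good a fit for ground truth 0 as prediction 0, yet it ends up unmatched and would be supervised as background. This is the kind of rigid correspondence that a fractional matching scheme aims to relax.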

Innovation

Methods, ideas, or system contributions that make the work stand out.

Regularized Transport Plan for flexible matching

Differentiable Sinkhorn algorithm for soft assignments

Improved handling of object densities and distributions
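For reference, the standard entropy-regularized optimal transport problem that Sinkhorn iterations solve can be written as follows. This is the textbook formulation, given here as an assumption about the general setup: the summary does not specify the paper's exact cost matrix, marginals, or regularizer, so the symbols C, a, b, and ε are illustrative.

```latex
% C_{ij}: cost of matching prediction i to ground truth j
% a, b:   mass assigned to predictions and ground truths (assumed marginals)
% \varepsilon > 0: regularization strength
\begin{aligned}
P^{\star} &= \operatorname*{arg\,min}_{P \in U(a,b)}
  \; \sum_{i,j} P_{ij} C_{ij} \;-\; \varepsilon H(P), \\
H(P) &= -\sum_{i,j} P_{ij}\,\bigl(\log P_{ij} - 1\bigr), \\
U(a,b) &= \bigl\{ P \ge 0 \;:\; P\mathbf{1} = a,\; P^{\top}\mathbf{1} = b \bigr\}, \\
P^{\star} &= \operatorname{diag}(u)\, K \,\operatorname{diag}(v),
  \qquad K_{ij} = \exp\!\left(-C_{ij}/\varepsilon\right).
\end{aligned}
```

The last line is what makes the method differentiable in practice: the scaling vectors u and v are obtained by alternating row and column normalizations, each of which is a smooth operation on the cost matrix.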

🔎 Similar Papers

SimPLR: A Simple and Plain Transformer for Efficient Object Detection and Segmentation