Integrating Diverse Assignment Strategies into DETRs

📅 2026-01-14
🤖 AI Summary
This work addresses the sparse supervision and slow convergence of DETR-style detectors caused by one-to-one label assignment, as well as the structural complexity and limited diversity of existing one-to-many approaches. The authors propose LoRA-DETR, which introduces multiple low-rank adaptation (LoRA) branches during training to enable parallel, diverse one-to-many label assignment strategies, thereby enriching the supervision signal. These auxiliary branches are removed at inference time, preserving a lightweight model architecture. Notably, the method requires no modification to the backbone network and achieves parameter-efficient, architecture-agnostic integration of multiple assignment strategies. The study demonstrates that supervision diversity—not merely quantity—is key to performance gains, achieving state-of-the-art results across multiple DETR baselines without incurring additional inference overhead.
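
To make the contrast between the two assignment regimes concrete, the following minimal sketch compares one-to-one matching with one possible one-to-many rule on a toy cost matrix. The function names, the brute-force matcher, and the top-k rule are illustrative assumptions, not the assignment strategies used by the authors.

```python
from itertools import permutations


def one_to_one(cost):
    """Pick exactly one distinct query per ground truth, minimising total cost.

    cost[g][q] is the matching cost between ground truth g and query q.
    Brute force over permutations: only suitable for tiny illustrative inputs
    (real DETR implementations use the Hungarian algorithm instead).
    """
    num_gt, num_queries = len(cost), len(cost[0])
    best = min(
        permutations(range(num_queries), num_gt),
        key=lambda perm: sum(cost[g][q] for g, q in enumerate(perm)),
    )
    return {g: [q] for g, q in enumerate(best)}


def one_to_many_topk(cost, k):
    """Hypothetical one-to-many rule: the k lowest-cost queries per ground truth.

    A query may be picked by several ground truths here; a real rule would
    need to resolve such conflicts.
    """
    return {
        g: sorted(range(len(row)), key=lambda q: row[q])[:k]
        for g, row in enumerate(cost)
    }


# Toy example: 2 ground-truth objects, 4 queries.
cost = [
    [0.1, 0.2, 0.9, 0.8],
    [0.7, 0.3, 0.2, 0.6],
]
o2o = one_to_one(cost)             # one positive query per object
o2m = one_to_many_topk(cost, k=2)  # two positive queries per object
print(o2o, o2m)
```

With one-to-one matching only two of the four queries receive a positive label, which is the sparse supervision the summary refers to; the top-k rule doubles the number of positives in this toy case.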

📝 Abstract
Label assignment is a critical component in object detectors, particularly within DETR-style frameworks where the one-to-one matching strategy, despite its end-to-end elegance, suffers from slow convergence due to sparse supervision. While recent works have explored one-to-many assignments to enrich supervisory signals, they often introduce complex, architecture-specific modifications and typically focus on a single auxiliary strategy, lacking a unified and scalable design. In this paper, we first systematically investigate the effects of "one-to-many" supervision and reveal a surprising insight: performance gains are driven not by the sheer quantity of supervision, but by the diversity of the assignment strategies employed. This finding suggests that a more elegant, parameter-efficient approach is attainable. Building on this insight, we propose LoRA-DETR, a flexible and lightweight framework that seamlessly integrates diverse assignment strategies into any DETR-style detector. Our method augments the primary network with multiple Low-Rank Adaptation (LoRA) branches during training, each instantiating a different one-to-many assignment rule. These branches act as auxiliary modules that inject rich, varied supervisory gradients into the main model and are discarded during inference, thus incurring no additional computational cost. This design promotes robust joint optimization while maintaining the architectural simplicity of the original detector. Extensive experiments on different baselines validate the effectiveness of our approach. Our work presents a new paradigm for enhancing detectors, demonstrating that diverse "one-to-many" supervision can be integrated to achieve state-of-the-art results without compromising model elegance.
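
The train-time branching described in the abstract can be sketched as a linear layer whose shared weight W is used by the main path, while each auxiliary branch k adds its own low-rank update B_k A_k. This is a minimal pure-Python illustration under assumptions: the class name `MultiLoRALinear`, the `drop_branches` helper, and the choice to attach branches to a single linear layer are hypothetical, and the abstract does not specify where the authors place their LoRA modules.

```python
import random


def matvec(matrix, vec):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


class MultiLoRALinear:
    """Shared weight W plus one low-rank update B_k A_k per auxiliary branch."""

    def __init__(self, in_dim, out_dim, rank, num_branches, seed=0):
        rng = random.Random(seed)
        self.weight = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
                       for _ in range(out_dim)]
        # Per branch: A is (rank x in_dim) and random; B is (out_dim x rank) and
        # zero, so each branch starts identical to the main path (usual LoRA init).
        self.lora_a = [[[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
                        for _ in range(rank)] for _ in range(num_branches)]
        self.lora_b = [[[0.0] * rank for _ in range(out_dim)]
                       for _ in range(num_branches)]

    def forward(self, x, branch=None):
        """Main path when branch is None, otherwise W x + B_k (A_k x)."""
        out = matvec(self.weight, x)
        if branch is None:
            return out
        delta = matvec(self.lora_b[branch], matvec(self.lora_a[branch], x))
        return [o + d for o, d in zip(out, delta)]

    def drop_branches(self):
        """Discard all auxiliary branches; only W remains for inference."""
        self.lora_a, self.lora_b = [], []


layer = MultiLoRALinear(in_dim=4, out_dim=3, rank=2, num_branches=2)
x = [1.0, 2.0, 3.0, 4.0]
main_out = layer.forward(x)              # one-to-one path
layer.lora_b[0][0][0] = 0.5              # stand-in for a training update on branch 0
branch_out = layer.forward(x, branch=0)  # auxiliary one-to-many path
layer.drop_branches()
inference_out = layer.forward(x)         # unchanged by the removed branches
```

Because every branch shares W, gradients from each auxiliary assignment rule would reach the main weights during training, while the inference path after `drop_branches()` is exactly the original layer.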
Problem

Research questions and friction points this paper is trying to address.

label assignment
DETR
one-to-many supervision
object detection
sparse supervision
Innovation

Methods, ideas, or system contributions that make the work stand out.

label assignment diversity
one-to-many supervision
LoRA-DETR
low-rank adaptation
DETR-style detectors
Yiwei Zhang
State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), CASIA, School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing Key Laboratory of Super Intelligent Security of Multi-Modal Information
Jin Gao
State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), CASIA, School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing Key Laboratory of Super Intelligent Security of Multi-Modal Information
Hanshi Wang
State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), CASIA, School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing Key Laboratory of Super Intelligent Security of Multi-Modal Information
Fudong Ge
State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), CASIA, School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing Key Laboratory of Super Intelligent Security of Multi-Modal Information
Guan Luo
Tsinghua University
3D generation
Weiming Hu
Shanghai Jiao Tong University
Computer Architecture
Zhipeng Zhang
School of Artificial Intelligence, Shanghai Jiao Tong University
Computer Vision,Object Tracking and Segmentation