RETR: Multi-View Radar Detection Transformer for Indoor Perception

📅 2024-11-15
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the insufficient modeling of multi-view geometry and radar signal characteristics in indoor multi-view radar perception, this paper proposes RETR (Radar dEtection TRansformer), an end-to-end detection Transformer framework tailored to the multi-view radar setting. It introduces three key modifications: depth-prioritized feature similarity via a tunable positional encoding (TPE), a tri-plane loss defined in both radar and camera coordinates for geometric consistency, and a learnable radar-to-camera transformation via reparameterization, together enabling robust multi-view feature alignment and end-to-end object detection and instance segmentation. Evaluated on two indoor radar benchmarks, RETR outperforms prior state-of-the-art methods by 15.38+ AP in object detection and 11.91+ IoU in instance segmentation, adapting the DETR paradigm to multi-view radar perception through geometry- and signal-aware representation learning.

📝 Abstract
Indoor radar perception has seen rising interest due to affordable costs driven by emerging automotive imaging radar developments and the benefits of reduced privacy concerns and reliability under hazardous conditions (e.g., fire and smoke). However, existing radar perception pipelines fail to account for distinctive characteristics of the multi-view radar setting. In this paper, we propose Radar dEtection TRansformer (RETR), an extension of the popular DETR architecture, tailored for multi-view radar perception. RETR inherits the advantages of DETR, eliminating the need for hand-crafted components for object detection and segmentation in the image plane. More importantly, RETR incorporates carefully designed modifications such as 1) depth-prioritized feature similarity via a tunable positional encoding (TPE); 2) a tri-plane loss from both radar and camera coordinates; and 3) a learnable radar-to-camera transformation via reparameterization, to account for the unique multi-view radar setting. Evaluated on two indoor radar perception datasets, our approach outperforms existing state-of-the-art methods by a margin of 15.38+ AP for object detection and 11.91+ IoU for instance segmentation, respectively. Our implementation is available at https://github.com/merlresearch/radar-detection-transformer.
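The abstract's first modification, depth-prioritized feature similarity via a tunable positional encoding (TPE), can be illustrated with a minimal sketch. This is not the paper's implementation (see the linked repository for that); it only shows the underlying idea under one assumption: a tunable weight on the depth axis makes positional similarity between tokens depend more strongly on depth agreement than on agreement along the other axes. The function name and the `alpha` parameter are hypothetical.

```python
import numpy as np

def depth_prioritized_similarity(q_pos, k_pos, alpha=4.0):
    """Hypothetical sketch of depth-prioritized similarity.

    q_pos: (N, 3) query positions as (x, y, depth)
    k_pos: (M, 3) key positions as (x, y, depth)
    alpha: tunable weight (> 1) that amplifies the depth axis, so a
           depth mismatch suppresses similarity more than an x/y one.
    Returns an (N, M) similarity matrix in (0, 1].
    """
    w = np.array([1.0, 1.0, alpha])          # boost the depth coordinate
    diff = q_pos[:, None, :] - k_pos[None, :, :]
    d2 = (diff ** 2 * w).sum(axis=-1)        # depth-weighted squared distance
    return np.exp(-d2)                       # 1 when positions coincide

# A query at the origin compared against two keys: one offset in x,
# one offset (by the same amount) in depth.
q = np.array([[0.0, 0.0, 0.0]])
k = np.array([[1.0, 0.0, 0.0],   # differs only in x
              [0.0, 0.0, 1.0]])  # differs only in depth
sim = depth_prioritized_similarity(q, k)
# The depth-offset key is penalized harder: sim[0, 0] > sim[0, 1].
```

With `alpha > 1`, tokens from different views that agree in depth score higher than tokens that agree laterally but not in depth, which is the intuition behind prioritizing depth in the attention similarity.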
Problem

Research questions and friction points this paper is trying to address.

Indoor Radar Systems
Multi-View Radar Detection
Object Recognition Accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

RETR
multi-view radar
transformer-based detection
Ryoma Yataka
Mitsubishi Electric Corporation
computer vision · radar perception · geometric deep learning · machine learning
Adriano Cardace
Department of Computer Science and Engineering, University of Bologna, Italy
Pu Wang
Mitsubishi Electric Research Laboratories (MERL), USA
P. Boufounos
Mitsubishi Electric Research Laboratories (MERL), USA
Ryuhei Takahashi
Information Technology R&D Center (ITC), Mitsubishi Electric Corporation, Japan