Ev-3DOD: Pushing the Temporal Boundaries of 3D Object Detection with Event Cameras

📅 2025-02-26
🤖 AI Summary
Conventional fixed-frame-rate sensors (LiDAR and RGB cameras) impose high latency, low temporal resolution, and bandwidth bottlenecks on 3D object detection. To address this, the work integrates asynchronous event cameras into a 3D detection framework for the first time and proposes an event-driven, continuous 3D detection paradigm. Methodologically, it designs a spatiotemporal encoding and alignment mechanism for event streams; develops a lightweight cross-modal (event + LiDAR) 3D feature extraction and fusion architecture; and introduces a continuous inference mechanism that operates across event sequences, including the intervals between synchronized frames. Key contributions include: (i) releasing DSEC-3DOD, the first large-scale event-based dataset with 100 FPS ground-truth 3D bounding boxes; and (ii) achieving a 67% average latency reduction (to millisecond scale) on DSEC-3DOD, while attaining mAP comparable to state-of-the-art multimodal methods and enabling real-time, continuous 3D perception that is not constrained by sensor frame rate.

📝 Abstract
Detecting 3D objects in point clouds plays a crucial role in autonomous driving systems. Recently, advanced multi-modal methods incorporating camera information have achieved notable performance. For a safe and effective autonomous driving system, algorithms that excel not only in accuracy but also in speed and low latency are essential. However, existing algorithms fail to meet these requirements due to the latency and bandwidth limitations of fixed frame rate sensors, e.g., LiDAR and camera. To address this limitation, we introduce asynchronous event cameras into 3D object detection for the first time. We leverage their high temporal resolution and low bandwidth to enable high-speed 3D object detection. Our method enables detection even during inter-frame intervals when synchronized data is unavailable, by retrieving previous 3D information through the event camera. Furthermore, we introduce the first event-based 3D object detection dataset, DSEC-3DOD, which includes ground-truth 3D bounding boxes at 100 FPS, establishing the first benchmark for event-based 3D detectors. The code and dataset are available at https://github.com/mickeykang16/Ev3DOD.
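As a rough illustration of what "leveraging high temporal resolution" can involve in practice, the sketch below bins an asynchronous event stream into a fixed-size spatiotemporal voxel grid, a representation commonly used in event-based vision. This is an assumption for illustration only: the abstract does not specify the encoding Ev-3DOD uses, and the function name `events_to_voxel_grid` and its parameters are hypothetical rather than taken from the released code.

```python
import numpy as np


def events_to_voxel_grid(x, y, t, p, num_bins, height, width):
    """Accumulate events into a (num_bins, height, width) grid.

    Hypothetical helper, not the paper's encoder.
    x, y: pixel coordinates; t: timestamps (any unit); p: polarity (>0 is positive).
    """
    grid = np.zeros((num_bins, height, width), dtype=np.float32)
    t = np.asarray(t, dtype=np.float64)
    if t.size == 0:
        return grid

    x = np.asarray(x, dtype=np.int64)
    y = np.asarray(y, dtype=np.int64)
    p = np.asarray(p)

    # Map each timestamp to a temporal bin index in [0, num_bins - 1].
    t0 = t.min()
    span = max(t.max() - t0, 1e-9)  # guard against a zero-length window
    bins = np.floor((t - t0) / span * num_bins).astype(np.int64)
    bins = np.minimum(bins, num_bins - 1)

    # Signed polarity: +1 for positive events, -1 otherwise.
    polarity = np.where(p > 0, 1.0, -1.0).astype(np.float32)

    # np.add.at accumulates correctly when several events hit the same cell.
    np.add.at(grid, (bins, y, x), polarity)
    return grid


# Three events on a 1x2 sensor over a 10 ms window (timestamps in microseconds).
grid = events_to_voxel_grid(
    x=[0, 1, 1], y=[0, 0, 0], t=[0, 5000, 10000], p=[1, 0, 0],
    num_bins=2, height=1, width=2,
)
print(grid.shape)
```

A dense grid like this can be fed to a standard convolutional backbone, at the cost of discarding timing detail finer than one bin. With annotations at 100 FPS, consecutive ground-truth boxes are 10 ms apart, so a window of that length is one plausible choice for the accumulation interval.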
Problem

Research questions and friction points this paper is trying to address.

Enhancing 3D object detection speed
Reducing latency in autonomous systems
Utilizing event cameras for continuous detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

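Uses asynchronous event cameras for 3D object detection
Exploits high temporal resolution to detect objects between sensor frames
Introduces DSEC-3DOD, the first event-based 3D object detection dataset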