AI Summary
To address the perception gaps and motion blur caused by the fixed frame rates of conventional LiDAR and RGB cameras in high-speed dynamic scenarios, this paper proposes the first continuous-time 3D object detection framework based on stereo event cameras. Methodologically, we design a dual-branch filtering network to jointly extract semantic and geometric features from asynchronous event streams, and introduce a spatiotemporal alignment mechanism together with a center-aligned regression optimization strategy, enabling end-to-end, purely event-driven 3D detection. Our key contributions are: (i) the first demonstration of continuous-time modeling and 3D localization using only stereo event data, bypassing frame-rate limitations entirely; and (ii) a novel dual-filtering mechanism and center-aligned regression that significantly improve detection accuracy and robustness for fast-moving objects. Experiments on multiple high-speed dynamic sequences show substantial improvements over state-of-the-art event-based and frame-based 3D detection methods.
Abstract
3D object detection is essential for autonomous systems, enabling precise localization and dimension estimation. While LiDAR and RGB cameras are widely used, their fixed frame rates create perception gaps in high-speed scenarios. Event cameras, with their asynchronous nature and high temporal resolution, offer a solution by capturing motion continuously. A recent approach integrates event cameras with conventional sensors for continuous-time detection, but it struggles in fast-motion scenarios because of its dependency on synchronized sensors. We propose a novel stereo 3D object detection framework that relies solely on event cameras, eliminating the need for conventional 3D sensors. To compensate for the lack of semantic and geometric information in event data, we introduce a dual filter mechanism that extracts both. Additionally, we enhance regression by aligning bounding boxes with object-centric information. Experiments show that our method outperforms prior approaches in dynamic environments, demonstrating the potential of event cameras for robust, continuous-time 3D perception. The code is available at https://github.com/mickeykang16/Ev-Stereo3D.
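The abstract describes networks that consume asynchronous event streams. As background, a common preprocessing step for such networks is to bin raw events (pixel coordinates, timestamp, polarity) into a fixed-size spatio-temporal tensor that a standard backbone can process. The sketch below is a generic illustration of that idea with assumed names and shapes; it is not the paper's actual pipeline, which should be consulted in the linked repository.

```python
import numpy as np

def events_to_voxel_grid(events, num_bins, height, width):
    """Bin asynchronous events into a (num_bins, H, W) tensor.

    `events` is an (N, 4) array of (x, y, t, polarity) rows -- an assumed
    layout for illustration, not the format used by Ev-Stereo3D.
    Positive polarity adds +1, negative adds -1, at the pixel and
    temporal bin of each event.
    """
    voxel = np.zeros((num_bins, height, width), dtype=np.float32)
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    t = events[:, 2]
    p = events[:, 3]
    # Normalize timestamps to [0, num_bins - 1]; guard against a
    # zero-duration window to avoid division by zero.
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9) * (num_bins - 1)
    bins = np.clip(t_norm.astype(int), 0, num_bins - 1)
    # Unbuffered accumulation so repeated (bin, y, x) indices all count.
    np.add.at(voxel, (bins, y, x), np.where(p > 0, 1.0, -1.0))
    return voxel
```

A stereo event setup would build one such grid per camera and feed both to the network; the continuous-time property comes from being able to re-bin any time window on demand rather than waiting for a fixed frame interval.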