🤖 AI Summary
Event-based optical flow estimation with contrast maximization (CM) is a highly non-convex optimization problem, and the temporally dense yet spatially sparse nature of event streams leads to poor convergence and low robustness. To address this, we propose a biologically inspired visual-inertial fusion method that uses inertial measurement unit (IMU)-derived 3D camera velocities to construct orientation maps, which are embedded as directional priors in the CM framework. These priors narrow the motion search space under sparse event conditions, improving both optimization stability and flow accuracy. The method is fully self-supervised, end-to-end trainable, and requires no additional ground-truth annotations. Experiments on the MVSEC, DSEC, and ECD benchmarks show higher accuracy than state-of-the-art methods.
📝 Abstract
Event cameras, by virtue of their working principle, directly encode motion within a scene. Many learning-based and model-based methods estimate event-based optical flow; however, the temporally dense yet spatially sparse nature of events poses significant challenges. Contrast maximization (CM) is a prominent model-based optimization methodology that addresses these challenges by estimating the motion trajectories of events within an event volume through optimal warping. Since its introduction, the CM framework has undergone a series of refinements by the computer vision community. Nonetheless, it remains a highly non-convex optimization problem. In this paper, we introduce a novel biologically inspired hybrid CM method for event-based optical flow estimation that couples visual and inertial motion cues. Concretely, we propose the use of orientation maps, derived from camera 3D velocities, as priors to guide the CM process. The orientation maps provide directional guidance and constrain the space of estimated motion trajectories. We show that this orientation-guided formulation leads to improved robustness and convergence in event-based optical flow estimation. Evaluated on the MVSEC, DSEC, and ECD datasets, our approach achieves higher accuracy than the state of the art.
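The abstract does not give the exact formulation, so the following is only a minimal sketch of how an orientation prior could guide a CM objective. It rests on several assumptions that are not taken from the paper: a pinhole camera undergoing pure translation (rotation ignored), a single flow vector per event patch, variance of the image of warped events as the contrast measure, and a cosine misalignment penalty. All function names and the `weight` parameter are hypothetical.

```python
import numpy as np

def orientation_map(v_cam, height, width, focal, cx, cy):
    """Per-pixel direction (radians) of the translational motion field.

    Assumes a pinhole camera with focal length `focal` (pixels), principal
    point (cx, cy), and camera velocity v_cam = (vx, vy, vz). Rotation is
    ignored in this sketch.
    """
    vx, vy, vz = v_cam
    px, py = np.meshgrid(np.arange(width), np.arange(height))
    x = (px - cx) / focal
    y = (py - cy) / focal
    # Image velocity is ((-vx + x*vz) / Z, (-vy + y*vz) / Z); positive depth Z
    # only scales the magnitude, so the direction needs no depth estimate.
    return np.arctan2(-vy + y * vz, -vx + x * vz)

def iwe_variance(events, flow, height, width, t_ref=0.0):
    """Contrast (variance) of the image of warped events for one flow vector.

    `events` is an (N, 3) array of (x, y, t); `flow` is (u, v) in pixels per
    unit time. Events are warped back to t_ref along the candidate flow.
    """
    x, y, t = events[:, 0], events[:, 1], events[:, 2]
    xw = np.rint(x - (t - t_ref) * flow[0]).astype(int)
    yw = np.rint(y - (t - t_ref) * flow[1]).astype(int)
    inside = (xw >= 0) & (xw < width) & (yw >= 0) & (yw < height)
    iwe = np.zeros((height, width))
    # np.add.at accumulates correctly when several events hit the same pixel.
    np.add.at(iwe, (yw[inside], xw[inside]), 1.0)
    return iwe.var()

def guided_objective(events, flow, prior_angle, height, width, weight=1.0):
    """Contrast minus a penalty for deviating from the prior direction.

    The penalty 1 - cos(angle difference) is an illustrative choice, not the
    paper's: it is 0 when the flow is aligned with the prior and 2 when it
    points the opposite way.
    """
    contrast = iwe_variance(events, flow, height, width)
    flow_angle = np.arctan2(flow[1], flow[0])
    misalignment = 1.0 - np.cos(flow_angle - prior_angle)
    return contrast - weight * misalignment
```

In this sketch, `prior_angle` for a patch could be read from `orientation_map` at the patch center, and an optimizer would search for the `flow` that maximizes `guided_objective`. The penalty lowers the score of candidate flows that disagree with the inertial direction, which is one simple way to shrink the search space that the abstract describes.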