AI Summary
To address the high computational cost of deep learning models (CNNs, RNNs, ViTs) and the weak spatiotemporal modeling capability of asynchronous models (SNNs, GNNs) in event-camera optical flow estimation, this paper proposes STSSM, the first lightweight spatiotemporal module integrating State Space Models (SSMs). STSSM processes event streams with an event-driven architecture paired with an efficient SSM kernel to capture long-range spatiotemporal dependencies, preserving the low-latency advantage of asynchronous processing while enhancing representational capacity. On the DSEC benchmark, the method achieves competitive accuracy with 4.5× faster inference and 8× fewer computations than TMA, and 2× fewer computations than EV-FlowNet. STSSM thus offers a favorable trade-off between accuracy and efficiency for event-based optical flow estimation.
Abstract
Event cameras unlock new frontiers that were previously unthinkable with standard frame-based cameras. One notable example is low-latency motion estimation (optical flow), which is critical for many real-time applications. In such applications, the computational efficiency of algorithms is paramount. Although recent deep learning paradigms such as CNNs, RNNs, and ViTs have shown remarkable performance, they often lack the desired computational efficiency. Conversely, asynchronous event-based methods, including SNNs and GNNs, are computationally efficient; however, these approaches fail to capture sufficient spatio-temporal information, which is required to achieve better performance in optical flow estimation. In this work, we introduce the Spatio-Temporal State Space Model (STSSM) module along with a novel network architecture to develop an extremely efficient solution with competitive performance. Our STSSM module leverages state-space models to effectively capture spatio-temporal correlations in event data, offering higher performance with lower complexity than ViT- and CNN-based architectures in similar settings. Our model achieves 4.5x faster inference and 8x lower computations compared to TMA, and 2x lower computations compared to EV-FlowNet, with competitive performance on the DSEC benchmark. Our code will be available at https://github.com/AhmedHumais/E-STMFlow
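For readers unfamiliar with the state-space models the abstract refers to, the following is a minimal sketch of a generic discrete linear SSM recurrence with a diagonal state matrix. It is not the STSSM module and makes no claim about the authors' architecture; the function name `ssm_scan` and the parameters `a`, `b`, `c` are hypothetical names chosen for illustration. It only shows how a hidden state can carry information from earlier inputs to later outputs at a fixed cost per time step.

```python
from typing import List, Sequence


def ssm_scan(
    a: Sequence[float],
    b: Sequence[float],
    c: Sequence[float],
    inputs: Sequence[float],
) -> List[float]:
    """Run a diagonal linear state-space recurrence over a 1-D input sequence.

    Illustrative only (not the paper's STSSM):
        h_t = a * h_{t-1} + b * x_t   (element-wise, one entry per state dim)
        y_t = sum(c * h_t)
    """
    if not (len(a) == len(b) == len(c)):
        raise ValueError("a, b and c must have the same length")

    state = [0.0] * len(a)  # hidden state h_0, initialised to zero
    outputs: List[float] = []
    for x in inputs:
        # Update every state dimension from its previous value and the input.
        state = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, state, b)]
        # Read out a scalar output as a weighted sum of the state.
        outputs.append(sum(c_i * h_i for c_i, h_i in zip(c, state)))
    return outputs


if __name__ == "__main__":
    # A single impulse followed by silence: the output decays geometrically,
    # showing how the state remembers the input across time steps.
    print(ssm_scan(a=[0.5], b=[1.0], c=[1.0], inputs=[1.0, 0.0, 0.0, 0.0]))
    # prints [1.0, 0.5, 0.25, 0.125]
```

Each step costs a constant amount of work per state dimension regardless of sequence length, which is one reason recurrences of this kind are attractive when efficiency matters.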