Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba

2024-05-09
arXiv.org
Citations: 5
Influential: 0
AI Summary
Event cameras produce sparse, high-temporal-resolution event streams that are poorly modeled by existing frame-based or point-cloud approaches: the former compromises temporal fidelity and incurs redundant computation, while the latter suffers from limited performance due to neglecting both explicit and implicit temporal dynamics. To address this, we propose EventMamba, the first framework to integrate state-space models (SSMs), specifically Mamba, into event point cloud processing. It comprises a hierarchical point cloud network and a redesigned global temporal aggregation module that explicitly encodes event timestamps and implicitly captures long-range temporal dependencies. Crucially, EventMamba operates directly on raw event clouds, eliminating frame-based discretization and enabling native spatiotemporal modeling. Evaluated on six action recognition benchmarks, it achieves state-of-the-art performance among point-cloud methods. Moreover, it consistently outperforms frame-based approaches on pose relocalization and eye-movement regression tasks, while significantly reducing computational overhead.

๐Ÿ“ Abstract
Event cameras draw inspiration from biological systems, offering low latency and high dynamic range while consuming minimal power. The prevailing approach to processing an Event Cloud converts it into frame-based representations, which neglects the sparsity of events, loses fine-grained temporal information, and increases the computational burden. In contrast, Point Cloud is a popular representation for processing 3-dimensional data and serves as an alternative that exploits local and global spatial features. Nevertheless, previous point-based methods perform unsatisfactorily compared with frame-based methods on spatio-temporal event streams. To bridge this gap, we propose EventMamba, an efficient and effective framework based on the Point Cloud representation that rethinks the distinction between Event Cloud and Point Cloud and emphasizes vital temporal information. The Event Cloud is fed into a hierarchical structure with staged modules to process both implicit and explicit temporal features. Specifically, we redesign the global extractor to enhance explicit temporal extraction among a long sequence of events with temporal aggregation and State Space Model (SSM) based Mamba. Our model consumes minimal computational resources in the experiments and still exhibits SOTA point-based performance on six action recognition datasets of different scales. It even outperforms all frame-based methods on both Camera Pose Relocalization (CPR) and eye-tracking regression tasks. Our code is available at: https://github.com/rhwxmx/EventMamba.
Problem

Research questions and friction points this paper is trying to address.

Converting event streams into frames loses fine-grained temporal information
Point Cloud methods underperform on spatio-temporal event streams
Need efficient temporal feature extraction for Event Clouds
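The contrast between the two representations can be illustrated with a minimal NumPy sketch. It is not the authors' preprocessing: the function names, the toy event stream, and the choice to scale every axis to [0, 1] are assumptions made for illustration only. Accumulating events into a count image discards timestamps, while an event cloud keeps each event as an (x, y, t) point.

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate events into one count image; timestamps are discarded."""
    frame = np.zeros((height, width), dtype=np.int32)
    xs = events[:, 0].astype(int)
    ys = events[:, 1].astype(int)
    # np.add.at handles repeated pixel indices correctly (unbuffered add).
    np.add.at(frame, (ys, xs), 1)
    return frame

def events_to_cloud(events, height, width):
    """Keep each event as a 3D point (x, y, t), every axis scaled to [0, 1].

    The normalization scheme is an assumption, not taken from the paper.
    """
    x = events[:, 0] / (width - 1)
    y = events[:, 1] / (height - 1)
    t = events[:, 2]
    span = t.max() - t.min()
    t_norm = (t - t.min()) / span if span > 0 else np.zeros_like(t)
    return np.stack([x, y, t_norm], axis=1)

# Hypothetical toy stream: columns are x, y, timestamp (microseconds).
events = np.array([
    [0, 0, 100.0],
    [3, 2, 250.0],
    [3, 2, 400.0],
    [1, 1, 900.0],
])

frame = events_to_frame(events, height=4, width=4)
cloud = events_to_cloud(events, height=4, width=4)
```

In the frame, the two events at pixel (3, 2) collapse into a single count of 2 and their ordering is lost; in the cloud they remain two distinct points separated along the time axis.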
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Point Cloud for Event Camera data
Integrates temporal aggregation with Mamba SSM
Hierarchical structure processes temporal features
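For readers unfamiliar with state space models, the recurrence that such models build on can be sketched in a few lines. This is a plain, fixed-parameter linear SSM with a diagonal state matrix, not Mamba's input-dependent selective scan and not the EventMamba implementation; the function name `ssm_scan` and all parameter values are hypothetical.

```python
import numpy as np

def ssm_scan(x, a, b, c):
    """Discrete linear SSM over a 1D sequence.

    h_t = a * h_{t-1} + b * x_t   (diagonal state transition)
    y_t = c . h_t
    a, b, c are vectors whose length is the state size.
    """
    h = np.zeros_like(a, dtype=float)
    ys = []
    for x_t in x:
        h = a * h + b * x_t
        ys.append(float(c @ h))
    return np.array(ys)

# Hypothetical parameters: two state channels with different decay rates.
a = np.array([0.5, 0.9])
b = np.array([1.0, 1.0])
c = np.array([1.0, 1.0])

# An impulse at the first step, e.g. one feature of time-ordered events.
x = np.array([1.0, 0.0, 0.0])
y = ssm_scan(x, a, b, c)
```

The hidden state carries information from earlier inputs forward with a per-channel decay, which is the basic mechanism that lets an SSM relate events that are far apart in a long, time-ordered sequence at a cost linear in sequence length.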
Hongwei Ren
MICS Thrust at the Hong Kong University of Science and Technology (Guangzhou)
Yue Zhou
MICS Thrust at the Hong Kong University of Science and Technology (Guangzhou)
Jiadong Zhu
MICS Thrust at the Hong Kong University of Science and Technology (Guangzhou)
Haotian Fu
Brown University
Yulong Huang
MICS Thrust at the Hong Kong University of Science and Technology (Guangzhou)
Xiaopeng Lin
MICS Thrust at the Hong Kong University of Science and Technology (Guangzhou)
Yuetong Fang
Ph.D. Student, HKUST(GZ)
Brain-inspired computing, Neuromorphic Computing, Embodied AI
Fei Ma
Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ)
Hao Yu
School of Microelectronics at Southern University of Science and Technology
Bo-Xun Cheng
MICS Thrust at the Hong Kong University of Science and Technology (Guangzhou)