🤖 AI Summary
Event camera data are spatially sparse but temporally dense, which leads to severe undersampling in conventional frame- or voxel-based representations. To address this, we propose a hypergraph-guided spatiotemporal event stream completion framework: (1) a cross-spatiotemporal hypergraph connects sparse event tokens, with RGB tokens fused in as additional nodes to enable multimodal collaborative completion; (2) a joint architecture combining hypergraph neural networks and self-attention supports dynamic message passing and multi-timestep feature aggregation. This work is the first to introduce hypergraph structures into event stream modeling; it effectively mitigates spatial undersampling and enables end-to-end joint completion and feature learning over RGB and event streams. On both single-label and multi-label event classification tasks, our method achieves state-of-the-art performance, demonstrating its efficacy in event completion and multimodal fusion.
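The hypergraph message-passing step described above can be sketched as a two-stage aggregation: node features are first pooled onto the hyperedges they belong to, then scattered back to the nodes and linearly transformed. The snippet below is a minimal NumPy sketch of this generic HGNN-style update, not the paper's exact layer; the function name `hypergraph_conv` and the mean-pooling choice are our own assumptions for illustration.

```python
import numpy as np

def hypergraph_conv(X, H, Theta):
    """One HGNN-style message-passing step: nodes -> hyperedges -> nodes.

    X:     (N, d)  token features (event and/or RGB tokens as hypergraph nodes)
    H:     (N, E)  binary incidence matrix; H[i, e] = 1 if node i is in hyperedge e
    Theta: (d, d') learnable linear transform (here just a fixed matrix)
    """
    Dv = H.sum(axis=1)                      # node degrees (edges per node)
    De = H.sum(axis=0)                      # hyperedge degrees (nodes per edge)
    # Stage 1: each hyperedge becomes the mean of its member nodes.
    edge_feats = (H.T @ X) / De[:, None]
    # Stage 2: each node aggregates the mean of its incident hyperedges,
    # so a sparse token can be "completed" from context across time/space.
    X_new = (H @ edge_feats) / Dv[:, None]
    return X_new @ Theta
```

Because a hyperedge can connect tokens from different timesteps and modalities, a single update lets an undersampled event token borrow features from temporally or spatially distant neighbors.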
📝 Abstract
Event cameras produce asynchronous event streams that are spatially sparse yet temporally dense. Mainstream event representation learning algorithms typically use event frames, voxels, or tensors as input. Although these approaches have achieved notable progress, they struggle to address the undersampling problem caused by spatial sparsity. In this paper, we propose a novel hypergraph-guided spatiotemporal event stream completion mechanism, which connects event tokens across different times and spatial locations via hypergraphs and leverages context-aware message passing to complete these sparse events. The proposed method can flexibly incorporate RGB tokens as nodes in the hypergraph within this completion framework, enabling multi-modal hypergraph-based information completion. Subsequently, we aggregate hypergraph node information across different time steps through self-attention, enabling effective learning and fusion of multi-modal features. Extensive experiments on both single- and multi-label event classification tasks validate the effectiveness of our proposed framework. The source code of this paper will be released at https://github.com/Event-AHU/EvRainDrop.
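One common way to build the kind of cross-spatiotemporal hypergraph the abstract describes is feature-space k-nearest-neighbor grouping: each token spawns a hyperedge containing itself and its k nearest neighbors, so an edge can freely span timesteps and mix event and RGB tokens. The sketch below assumes this kNN construction, which is our illustrative choice; the paper's actual grouping rule and the helper name `build_knn_hypergraph` are not taken from the source.

```python
import numpy as np

def build_knn_hypergraph(tokens, k=3):
    """Build an (N, N) incidence matrix: hyperedge e = token e plus its
    k nearest neighbors in feature space. Since distances ignore which
    timestep or modality a token came from, edges naturally connect
    event tokens across time and RGB tokens in one structure."""
    N = tokens.shape[0]
    # pairwise squared Euclidean distances between all tokens
    d2 = ((tokens[:, None, :] - tokens[None, :, :]) ** 2).sum(-1)
    # k+1 smallest distances per row: the token itself (distance 0) + k neighbors
    nn = np.argsort(d2, axis=1)[:, :k + 1]
    H = np.zeros((N, N))
    for e, members in enumerate(nn):
        H[members, e] = 1.0
    return H
```

In practice the token set would be the concatenation of event tokens from all timesteps and the RGB tokens, so the resulting incidence matrix directly supports the multimodal completion step.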