🤖 AI Summary
Long-sequence event-based person re-identification (Re-ID) under adverse illumination (e.g., strong backlight or low light) faces severe background clutter, difficult temporal modeling, and exposure of facial information. To address these challenges, this paper proposes S3CE-Net, an identity matching framework built on spiking neural networks (SNNs). The method has two key components: (1) a Spike-guided Spatial-temporal Attention Mechanism (SSAM) that performs semantic interaction and association across the spatial and temporal dimensions of event streams; and (2) a Spatiotemporal Feature Sampling Strategy (STFS) that samples spatial and temporal feature subsequences to broaden semantic coverage and improve robustness. STFS adds no parameters and is used only during training. The framework retains the high temporal resolution of event data while protecting facial privacy. Evaluated on multiple mainstream long-sequence event-based Re-ID benchmarks, it achieves outstanding performance with a low parameter count and high efficiency.
📝 Abstract
In this paper, we study the long-sequence event-based person re-identification (Re-ID) task, leveraging the ability of event cameras to withstand harsh lighting conditions, reduce background interference, achieve high temporal resolution, and protect facial information. To this end, we propose a simple and efficient long-sequence event Re-ID model, namely the Spike-guided Spatiotemporal Semantic Coupling and Expansion Network (S3CE-Net). To better handle asynchronous event data, we build S3CE-Net on spiking neural networks (SNNs). S3CE-Net incorporates the Spike-guided Spatial-temporal Attention Mechanism (SSAM) and the Spatiotemporal Feature Sampling Strategy (STFS). The SSAM carries out semantic interaction and association in both the spatial and temporal dimensions, leveraging the capabilities of SNNs. The STFS samples spatial feature subsequences and temporal feature subsequences from the spatiotemporal dimensions, driving the Re-ID model to perceive broader and more robust effective semantics. Notably, the STFS introduces no additional parameters and is used only during the training stage. Therefore, S3CE-Net is a low-parameter and high-efficiency model for long-sequence event-based person Re-ID. Extensive experiments verify that S3CE-Net achieves outstanding performance on many mainstream long-sequence event-based person Re-ID datasets. Code is available at: https://github.com/Mhsunshine/SC3E_Net.
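To make the idea of parameter-free, training-only subsequence sampling concrete, the following is a minimal sketch and not the authors' implementation. The function name `sample_subsequences`, the window lengths `t_len` and `s_len`, the use of contiguous random windows, and the list-of-lists feature layout are all assumptions chosen for illustration; the abstract does not specify how STFS selects its subsequences.

```python
import random


def sample_subsequences(features, t_len, s_len, training=True, rng=None):
    """Return (temporal_sub, spatial_sub) sampled from a feature sequence.

    features: list of T time steps, each a list of S spatial-part features.
    Hypothetical helper: contiguous random windows are an assumption here,
    not the sampling rule described in the paper.
    """
    if not training:
        # No sampling outside training: both branches see the full sequence.
        return features, features
    if rng is None:
        rng = random.Random()
    num_steps = len(features)
    num_parts = len(features[0])
    if not (0 < t_len <= num_steps and 0 < s_len <= num_parts):
        raise ValueError("window lengths must fit inside the sequence")
    # Temporal subsequence: a contiguous window of time steps, all parts kept.
    t0 = rng.randint(0, num_steps - t_len)
    temporal_sub = features[t0:t0 + t_len]
    # Spatial subsequence: the same window of parts taken at every time step.
    s0 = rng.randint(0, num_parts - s_len)
    spatial_sub = [step[s0:s0 + s_len] for step in features]
    return temporal_sub, spatial_sub


# Toy input: 8 time steps, 6 spatial parts, each feature labelled (t, s).
feats = [[(t, s) for s in range(6)] for t in range(8)]
temporal_sub, spatial_sub = sample_subsequences(
    feats, t_len=4, s_len=3, rng=random.Random(0)
)
full_t, full_s = sample_subsequences(feats, t_len=4, s_len=3, training=False)
```

The sampling involves only index selection, so it adds no learnable parameters, and the `training` flag turns it off at inference time, which mirrors the two properties the abstract attributes to STFS.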