SpikePool: Event-driven Spiking Transformer with Pooling Attention

📅 2025-10-13
🤖 AI Summary
Current spiking Transformers are developed mainly through architectural modifications, with little analysis of how they process event data as a signal. Through a frequency-domain analysis, this work shows that spiking Transformers behave as high-pass filters, in contrast to Vision Transformers (ViTs), which act as low-pass filters. This matters because event streams carry valuable high-frequency information but are also sparse and noisy. To address this, the work proposes SpikePool, which replaces spike-based self-attention with max-pooling attention, a low-pass filtering operation, to create a selective band-pass effect that preserves meaningful high-frequency content while suppressing noise. SpikePool reports competitive results on event-based classification and object detection benchmarks while reducing training and inference time by up to 42.5% and 32.8%, respectively.

📝 Abstract
Building on the success of transformers, Spiking Neural Networks (SNNs) have increasingly been integrated with transformer architectures, leading to spiking transformers that demonstrate promising performance on event-based vision tasks. However, despite these empirical successes, there remains limited understanding of how spiking transformers fundamentally process event-based data. Current approaches primarily focus on architectural modifications without analyzing the underlying signal processing characteristics. In this work, we analyze spiking transformers through the frequency spectrum domain and discover that they behave as high-pass filters, contrasting with Vision Transformers (ViTs) that act as low-pass filters. This frequency domain analysis reveals why certain designs work well for event-based data, which contains valuable high-frequency information but is also sparse and noisy. Based on this observation, we propose SpikePool, which replaces spike-based self-attention with max pooling attention, a low-pass filtering operation, to create a selective band-pass filtering effect. This design preserves meaningful high-frequency content while capturing critical features and suppressing noise, achieving a better balance for event-based data processing. Our approach demonstrates competitive results on event-based datasets for both classification and object detection tasks while significantly reducing training and inference time by up to 42.5% and 32.8%, respectively.
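As a rough illustration of what "high-pass" and "low-pass" mean here, the sketch below measures the fraction of a signal's spectral energy that lies above a cutoff frequency. It is a toy 1-D example using only the Python standard library, not the paper's analysis pipeline: the function names `dft_energy` and `high_freq_ratio` and the cutoff of 0.25 cycles per sample are assumptions chosen for illustration.

```python
import cmath


def dft_energy(signal):
    """Return |X[k]|^2 for every DFT bin k of a real 1-D signal (naive O(n^2) DFT)."""
    n = len(signal)
    energies = []
    for k in range(n):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal)
        )
        energies.append(abs(coeff) ** 2)
    return energies


def high_freq_ratio(signal, cutoff=0.25):
    """Fraction of spectral energy above `cutoff` (cycles per sample, at most 0.5).

    Hypothetical helper: the cutoff value is an illustrative assumption.
    """
    n = len(signal)
    energies = dft_energy(signal)
    total = sum(energies)
    if total == 0:
        return 0.0
    high = 0.0
    for k, energy in enumerate(energies):
        freq = min(k, n - k) / n  # fold negative frequencies onto [0, 0.5]
        if freq > cutoff:
            high += energy
    return high / total


# A constant signal keeps all of its energy at frequency zero (DC).
smooth = [1.0] * 16
# A signal that flips sign every sample keeps all of its energy at the Nyquist frequency.
alternating = [(-1.0) ** t for t in range(16)]

smooth_ratio = high_freq_ratio(smooth)
alternating_ratio = high_freq_ratio(alternating)
```

Under this measure, a layer that behaves as a high-pass filter would tend to raise the ratio of its output relative to its input, and a low-pass layer would tend to lower it.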
Problem

Research questions and friction points this paper is trying to address.

Analyzes spiking transformers' frequency domain behavior on event data
Proposes pooling attention to balance high-frequency preservation and noise suppression
Improves event-based vision task performance while reducing computational costs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Replaces spike-based self-attention with max pooling
Creates selective band-pass filtering for event data
Reduces training and inference time significantly
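The summary does not spell out the exact form of the pooling attention, so the following is only a minimal sketch of the general idea: a stride-1 max pooling pass over a binary spike map, used as a token mixer in place of self-attention. The function name `max_pool_2d`, the 3x3 kernel, and the "same" border handling are assumptions for illustration and are not taken from the paper.

```python
def max_pool_2d(spikes, kernel=3):
    """Stride-1 max pooling with 'same' output size over a 2-D grid of spike values.

    Hypothetical helper: kernel size and border handling are illustrative choices.
    Windows are clipped at the borders instead of zero-padded.
    """
    pad = kernel // 2
    height, width = len(spikes), len(spikes[0])
    pooled = [[0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            window = [
                spikes[y][x]
                for y in range(max(0, i - pad), min(height, i + pad + 1))
                for x in range(max(0, j - pad), min(width, j + pad + 1))
            ]
            pooled[i][j] = max(window)
    return pooled


# A 5x5 spike map containing a single isolated spike at the center.
spike_map = [[0] * 5 for _ in range(5)]
spike_map[2][2] = 1

pooled_map = max_pool_2d(spike_map)
```

Because the maximum of binary values is itself binary, the output stays spike-like, and the operation has no learned parameters and no query-key products, which is one plausible reason a pooling mixer is cheaper than self-attention. In this toy case the single spike spreads to its 3x3 neighborhood.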
Donghyun Lee
Department of Electrical Engineering, Yale University
Alex Sima
Department of Computer Science, Yale University
Yuhang Li
Yale University
Machine Learning
Panos Stinis
Pacific Northwest National Laboratory
Scientific computing
Priyadarshini Panda
Department of Electrical Engineering, Yale University