AI Summary
To address the low energy efficiency and limited performance of ANN-SNN hybrid approaches to event-camera-based object tracking, this paper proposes SDTrack, the first fully spike-driven Transformer tracking pipeline, built on a Spiking MetaFormer backbone. Its core innovation is the Global Trajectory Prompt (GTP) mechanism, which fuses event streams with global trajectory priors directly in the spike domain, enabling end-to-end tracking without data augmentation or post-processing. The method employs event-image encoding and spike-domain coordinate regression to establish a lightweight, low-power, high-performance spiking neural network (SNN) tracking baseline. Evaluated on multiple event-based tracking benchmarks, SDTrack achieves state-of-the-art (SOTA) accuracy while attaining the lowest parameter count and energy consumption among existing methods. This work establishes a new paradigm and a new foundational baseline for neuromorphic vision-based tracking.
Abstract
Event cameras provide superior temporal resolution, dynamic range, power efficiency, and pixel bandwidth. Spiking Neural Networks (SNNs) naturally complement event data through discrete spike signals, making them ideal for event-based tracking. However, current approaches combine Artificial Neural Networks (ANNs) with SNNs and rely on suboptimal architectures, which compromises energy efficiency and limits tracking performance. To address these limitations, we propose the first Transformer-based spike-driven tracking pipeline. Our Global Trajectory Prompt (GTP) method effectively captures global trajectory information and aggregates it with event streams into event images to enhance spatiotemporal representation. We then introduce SDTrack, a Transformer-based spike-driven tracker comprising a Spiking MetaFormer backbone and a simple tracking head that directly predicts normalized coordinates from spike signals. The framework is end-to-end and requires neither data augmentation nor post-processing. Extensive experiments demonstrate that SDTrack achieves state-of-the-art performance while maintaining the lowest parameter count and energy consumption across multiple event-based tracking benchmarks, establishing a solid baseline for future research in neuromorphic vision.
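To make the pipeline concrete, the sketch below illustrates the ideas named in the abstract: accumulating an event stream into an event image, overlaying past target positions as a trajectory prompt, encoding the result as binary spike frames, and regressing normalized coordinates from spike counts. This is a minimal NumPy illustration, not the paper's implementation; all function names, the integrate-and-fire encoder, and the centroid-based regression head are simplified assumptions standing in for the actual GTP module and Spiking MetaFormer backbone.

```python
import numpy as np

def events_to_image(events, H, W):
    """Accumulate (x, y, polarity) events into a 2-channel event image."""
    img = np.zeros((2, H, W), dtype=np.float32)
    for x, y, p in events:
        img[p, y, x] += 1.0
    return img

def global_trajectory_prompt(img, trajectory, radius=1):
    """Toy stand-in for GTP: overlay past target centers (a motion prior)
    onto the event image so the tracker sees global trajectory context."""
    prompted = img.copy()
    _, H, W = img.shape
    for cx, cy in trajectory:
        y0, y1 = max(0, cy - radius), min(H, cy + radius + 1)
        x0, x1 = max(0, cx - radius), min(W, cx + radius + 1)
        prompted[:, y0:y1, x0:x1] += 1.0
    return prompted

def spike_encode(img, T=4, threshold=0.5):
    """Encode the image into T binary spike frames with a simple
    integrate-and-fire neuron (soft reset)."""
    mem = np.zeros_like(img)
    frames = []
    for _ in range(T):
        mem += img
        s = (mem >= threshold).astype(np.float32)
        mem -= s * threshold  # soft reset after firing
        frames.append(s)
    return np.stack(frames)  # shape (T, 2, H, W)

def regress_box_center(spikes):
    """Toy spike-domain regression: the firing-rate centroid gives a
    normalized (cx, cy) in [0, 1], mimicking direct coordinate prediction."""
    rate = spikes.mean(axis=(0, 1))          # (H, W) firing rate
    H, W = rate.shape
    total = rate.sum() + 1e-8
    ys, xs = np.mgrid[0:H, 0:W]
    cy = (rate * ys).sum() / total / (H - 1)
    cx = (rate * xs).sum() / total / (W - 1)
    return cx, cy

# Usage: a burst of events near pixel (12, 4) on a 16x16 sensor.
events = [(12, 4, 0)] * 3
img = global_trajectory_prompt(events_to_image(events, 16, 16), trajectory=[])
cx, cy = regress_box_center(spike_encode(img))
```

Note that every stage operates on additions and threshold comparisons only, which is what makes fully spike-driven designs attractive for low-power hardware compared with the multiply-heavy activations of ANN-SNN hybrids.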