SDTrack: A Baseline for Event-based Tracking via Spiking Neural Networks

2025-03-09
Citations: 0
Influential: 0
AI Summary
To address the low energy efficiency and limited performance of ANN-SNN hybrid approaches in event-camera-based object tracking, this paper proposes SDTrack, which it describes as the first Transformer-based spike-driven tracking pipeline. SDTrack pairs a Spiking MetaFormer backbone with a simple tracking head that predicts normalized coordinates directly from spike signals. Its Global Trajectory Prompt (GTP) method captures global trajectory information and aggregates it with the event stream into event images, and the framework runs end-to-end without data augmentation or post-processing. Evaluated on multiple event-based tracking benchmarks, SDTrack achieves state-of-the-art (SOTA) accuracy with the lowest parameter count and energy consumption among the compared methods, providing a baseline for future research on neuromorphic vision-based tracking.

Abstract
Event cameras provide superior temporal resolution, dynamic range, power efficiency, and pixel bandwidth. Spiking Neural Networks (SNNs) naturally complement event data through discrete spike signals, making them ideal for event-based tracking. However, current approaches that combine Artificial Neural Networks (ANNs) and SNNs, along with suboptimal architectures, compromise energy efficiency and limit tracking performance. To address these limitations, we propose the first Transformer-based spike-driven tracking pipeline. Our Global Trajectory Prompt (GTP) method effectively captures global trajectory information and aggregates it with event streams into event images to enhance spatiotemporal representation. We then introduce SDTrack, a Transformer-based spike-driven tracker comprising a Spiking MetaFormer backbone and a simple tracking head that directly predicts normalized coordinates using spike signals. The framework is end-to-end and does not require data augmentation or post-processing. Extensive experiments demonstrate that SDTrack achieves state-of-the-art performance while maintaining the lowest parameter count and energy consumption across multiple event-based tracking benchmarks, establishing a solid baseline for future research in the field of neuromorphic vision.
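As a rough illustration of what aggregating an event stream into an event image can look like, the following minimal Python sketch accumulates events into a three-channel array. The function name `events_to_image`, the `(x, y, polarity)` event layout, and the use of a third channel to mark trajectory points are all assumptions made for illustration. The abstract does not specify how GTP encodes trajectory information, so this should not be read as the authors' implementation.

```python
import numpy as np


def events_to_image(events, height, width, trajectory=None):
    """Accumulate events into a 3-channel image (hypothetical layout).

    events:     array of shape (N, 3) with columns (x, y, polarity).
    trajectory: optional array of shape (M, 2) with (x, y) points;
                marking them in channel 2 is an assumption, not the
                paper's documented GTP encoding.
    """
    image = np.zeros((3, height, width), dtype=np.float32)
    xs = events[:, 0].astype(int)
    ys = events[:, 1].astype(int)
    positive = events[:, 2] > 0

    # np.add.at handles repeated pixel indices correctly (unbuffered add).
    np.add.at(image[0], (ys[positive], xs[positive]), 1.0)    # positive events
    np.add.at(image[1], (ys[~positive], xs[~positive]), 1.0)  # negative events

    if trajectory is not None:
        tx = trajectory[:, 0].astype(int)
        ty = trajectory[:, 1].astype(int)
        image[2, ty, tx] = 1.0  # binary trajectory mask (assumed)
    return image
```

Keeping polarity counts and the trajectory mask in separate channels is one simple way to keep the result image-shaped, so that a standard backbone can consume it. Other encodings are equally plausible.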
Problem

Research questions and friction points this paper is trying to address.

Combining ANNs and SNNs compromises energy efficiency and tracking performance.
Transformer-based spike-driven tracking pipeline enhances spatiotemporal representation.
SDTrack achieves state-of-the-art performance with low energy consumption.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based spike-driven tracking pipeline
Global Trajectory Prompt enhances spatiotemporal representation
SDTrack: Spiking MetaFormer backbone with low energy
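For readers unfamiliar with spike-driven computation, the sketch below shows a generic leaky integrate-and-fire (LIF) neuron layer in Python. It is a textbook-style illustration of how continuous inputs become discrete spike signals. The function name `lif_forward`, the threshold, the decay factor, and the hard reset are assumptions and are not taken from SDTrack or its Spiking MetaFormer backbone.

```python
import numpy as np


def lif_forward(inputs, threshold=1.0, decay=0.5):
    """Run a generic LIF layer over T time steps.

    inputs: array of shape (T, N) holding the input current per step.
    Returns a binary spike array of the same shape.
    """
    membrane = np.zeros(inputs.shape[1:], dtype=np.float64)
    spikes = np.zeros_like(inputs, dtype=np.float32)
    for t in range(inputs.shape[0]):
        # Leak the previous potential, then integrate the new input.
        membrane = decay * membrane + inputs[t]
        fired = membrane >= threshold
        spikes[t] = fired
        # Hard reset: neurons that fired go back to zero potential.
        membrane = np.where(fired, 0.0, membrane)
    return spikes
```

Because the output is binary, downstream layers can in principle replace multiplications with sparse additions, which is the usual argument for the energy efficiency of SNNs.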
Yimeng Shan
Liaoning Technical University
Spiking Neural Networks, Neuromorphic Vision, Single Object Tracking, Event Camera
Zhenbang Ren
University of Electronic Science and Technology of China, China
Haodi Wu
University of Electronic Science and Technology of China, China
Wenjie Wei
University of Electronic Science and Technology of China
Spiking Neural Network, Neuromorphic Computing, Model Compression, Event-based Vision
Rui-Jie Zhu
Ph.D. Student, University of California, Santa Cruz
Brain-Inspired Engineering, Language Modeling
Shuai Wang
University of Electronic Science and Technology of China, China
Dehao Zhang
University of Electronic Science and Technology of China
Spiking Neural Network
Yichen Xiao
University of Electronic Science and Technology of China, China
Jieyuan Zhang
University of Electronic Science and Technology of China, China
Kexin Shi
University of Electronic Science and Technology of China, China
Jingzhinan Wang
University of Electronic Science and Technology of China, China
Jason K. Eshraghian
University of California, Santa Cruz, Assistant Professor
lightweight machine learning, neuromorphic computing, spiking neural networks
Haicheng Qu
Liaoning Technical University, China
Jiqing Zhang
Dalian Maritime University, China
Malu Zhang
University of Electronic Science and Technology of China, China
Yang Yang
University of Electronic Science and Technology of China, China