SpikeSMOKE: Spiking Neural Networks for Monocular 3D Object Detection with Cross-Scale Gated Coding

📅 2025-06-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the information loss and limited representational capacity that discrete spike events cause in spiking neural networks (SNNs) for monocular 3D object detection, and the excessive energy consumption of conventional artificial neural networks (ANNs), this paper proposes SpikeSMOKE, an energy-efficient SNN architecture. Its key contributions are: (1) a Cross-Scale Gated Coding (CSGC) mechanism that combines attention-based cross-scale fusion with biologically inspired gated filtering; and (2) a lightweight residual block that preserves the spiking computing paradigm while reducing computation and speeding up training. On the KITTI benchmark, SpikeSMOKE with CSGC improves 3D AP over the baseline SpikeSMOKE by about 3 points at each of the Easy, Moderate, and Hard difficulty levels (+2.82, +3.2, +3.17). On the Hard category, it reduces energy consumption by 72.2% compared to SMOKE at the cost of a 4% drop in detection performance. The lightweight variant, SpikeSMOKE-L, further reduces parameter count by 3× and computation by 10× relative to SMOKE. These results support the practical deployment of SNNs for edge-based 3D perception.

📝 Abstract
Low-energy 3D object detection is an important research area because the energy consumption of detectors grows with their wide application in fields such as autonomous driving. Spiking neural networks (SNNs), with their low power consumption, offer a novel solution. We therefore apply SNNs to monocular 3D object detection and propose the SpikeSMOKE architecture, a new attempt at low-power monocular 3D object detection. The discrete signals of SNNs cause information loss and limit feature expression ability compared with artificial neural networks (ANNs). To address this issue, inspired by the filtering mechanism of biological neuronal synapses, we propose a cross-scale gated coding mechanism (CSGC), which enhances feature representation by combining attention-based cross-scale fusion with gated filtering mechanisms. In addition, to reduce computation and speed up training, we present a novel lightweight residual block that maintains the spiking computing paradigm and the highest possible detection performance. Compared to the baseline SpikeSMOKE on 3D object detection, the proposed SpikeSMOKE with CSGC achieves 11.78 (+2.82, Easy), 10.69 (+3.2, Moderate), and 10.48 (+3.17, Hard) on the KITTI autonomous driving dataset, measured by AP|R11 at an IoU threshold of 0.7. Notably, SpikeSMOKE significantly reduces energy consumption compared to SMOKE. For example, energy consumption is reduced by 72.2% on the Hard category, while detection performance is reduced by only 4%. SpikeSMOKE-L (lightweight) further reduces the number of parameters by 3 times and the computation by 10 times compared to SMOKE.
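The reported gains are easier to read next to the baseline scores they imply. The short sketch below recovers the baseline SpikeSMOKE AP values by subtracting each reported gain from the corresponding CSGC result, then computes the mean gain and the gain relative to the baseline. The input numbers come from the abstract; the variable names are illustrative, and the derived baseline values are simple arithmetic, not figures quoted from the paper.

```python
# AP|R11 at IoU 0.7 for SpikeSMOKE with CSGC, and the reported gain over the
# baseline SpikeSMOKE, as stated in the abstract: (AP, gain in AP points).
reported = {
    "Easy": (11.78, 2.82),
    "Moderate": (10.69, 3.20),
    "Hard": (10.48, 3.17),
}

# Implied baseline AP: reported AP minus the reported gain (derived, not quoted).
baseline = {level: round(ap - gain, 2) for level, (ap, gain) in reported.items()}

# Mean absolute gain across the three difficulty levels, in AP points.
mean_gain = sum(gain for _, gain in reported.values()) / len(reported)

# Gain expressed relative to the implied baseline (a fraction, not AP points).
relative_gain = {
    level: gain / baseline[level] for level, (_, gain) in reported.items()
}

for level in reported:
    print(f"{level}: baseline {baseline[level]:.2f}, "
          f"relative gain {relative_gain[level]:.1%}")
print(f"Mean absolute gain: {mean_gain:.2f} AP points")
```

The gains are absolute AP points: roughly 3 points on average, which is a much larger change when expressed relative to the implied baseline scores.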
Problem

Research questions and friction points this paper is trying to address.

Low-energy 3D object detection using spiking neural networks
Enhancing feature representation with cross-scale gated coding
Reducing computation while maintaining detection performance
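To illustrate why discrete spikes lose information, here is a minimal sketch of a generic leaky integrate-and-fire (LIF) neuron. It is not the neuron model or the configuration used in SpikeSMOKE, which this summary does not specify; the function name, `threshold`, and `leak` values are assumptions chosen only for illustration.

```python
def lif_spike_train(inputs, threshold=1.0, leak=0.5):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    Generic textbook LIF with hard reset; parameter values are illustrative
    assumptions, not taken from the SpikeSMOKE paper.
    """
    membrane = 0.0
    spikes = []
    for current in inputs:
        # Leak part of the previous potential, then integrate the new input.
        membrane = leak * membrane + current
        if membrane >= threshold:
            spikes.append(1)   # emit a binary spike
            membrane = 0.0     # hard reset after firing
        else:
            spikes.append(0)   # stay silent
    return spikes


# Two different constant inputs that never reach the threshold produce
# identical (all-zero) outputs, so the difference between them is lost.
weak_a = lif_spike_train([0.3] * 4)
weak_b = lif_spike_train([0.4] * 4)

# Stronger inputs are distinguished only by spike timing and count.
mid = lif_spike_train([0.6] * 4)
strong = lif_spike_train([0.7] * 4)

print(weak_a, weak_b, mid, strong)
```

Because the output at each step is a single bit, a continuous activation is reduced to a sparse binary train. That sparsity is what makes SNNs cheap to run, and it is also the representational limit that a feature-enhancement mechanism such as CSGC is meant to offset.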
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spiking neural networks applied to monocular 3D object detection
Cross-scale gated coding enhances features
Lightweight residual block reduces computation