Training-Free ANN-to-SNN Conversion for High-Performance Spiking Transformer

📅 2025-08-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing ANN-to-SNN conversion methods struggle to handle the complex nonlinear operations in Transformers and rely heavily on post-training fine-tuning, hindering efficient deployment of spiking Transformers. This paper proposes a training-free ANN-to-SNN conversion framework. It introduces Multi-Basis Exponential Decay (MBE) spiking neurons that accurately approximate key nonlinearities, including Softmax, LayerNorm, and GeLU, without weight updates or architectural modifications, enabling plug-and-play integration. Coupled with an event-driven execution paradigm and a multi-basis encoding strategy, the framework supports mainstream architectures such as ViT, RoBERTa, and GPT-2. Evaluated across computer vision (CV), natural language understanding (NLU), and natural language generation (NLG) tasks, it achieves near-lossless accuracy (average degradation <0.5%), reduces inference latency by 3.2–5.8×, and significantly improves energy efficiency. The approach establishes a novel paradigm for scalable, high-performance spiking Transformer deployment.

📝 Abstract
Leveraging the event-driven paradigm, Spiking Neural Networks (SNNs) offer a promising approach for constructing energy-efficient Transformer architectures. Compared to directly trained Spiking Transformers, ANN-to-SNN conversion methods bypass the high training costs. However, existing methods still suffer from notable limitations, failing to effectively handle nonlinear operations in Transformer architectures and requiring additional fine-tuning processes for pre-trained ANNs. To address these issues, we propose a high-performance and training-free ANN-to-SNN conversion framework tailored for Transformer architectures. Specifically, we introduce a Multi-basis Exponential Decay (MBE) neuron, which employs an exponential decay strategy and multi-basis encoding method to efficiently approximate various nonlinear operations. It removes the requirement for weight modifications in pre-trained ANNs. Extensive experiments across diverse tasks (CV, NLU, NLG) and mainstream Transformer architectures (ViT, RoBERTa, GPT-2) demonstrate that our method achieves near-lossless conversion accuracy with significantly lower latency. This provides a promising pathway for the efficient and scalable deployment of Spiking Transformers in real-world applications.
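To make the abstract's "exponential decay strategy and multi-basis encoding" concrete, here is a minimal illustrative sketch of how a real-valued activation can be encoded as spikes across several bases, each firing with exponentially decaying weights over time steps. The function names, bases, and decay rate below are illustrative assumptions, not the paper's exact MBE neuron model:

```python
def mbe_encode(x, bases=(1.0, 0.5), decay=0.5, steps=8):
    """Greedy spike encoding sketch: at each time step, a basis fires
    if its exponentially decaying weight still fits into the residual.
    Constants and structure are assumptions for illustration only."""
    residual = x
    spikes = [[0] * steps for _ in bases]
    for t in range(steps):
        for i, b in enumerate(bases):
            w = b * decay ** t          # spike weight decays over time
            if residual >= w - 1e-12:
                spikes[i][t] = 1
                residual -= w
    return spikes

def mbe_decode(spikes, bases=(1.0, 0.5), decay=0.5):
    """Sum the spike train, weighting each spike by its basis amplitude
    and the exponential decay at its time step."""
    return sum(s * b * decay ** t
               for i, b in enumerate(bases)
               for t, s in enumerate(spikes[i]))

x = 0.8125
err = abs(mbe_decode(mbe_encode(x)) - x)
print(f"reconstruction error: {err:.6f}")
```

With more bases or time steps, the reconstruction error shrinks, which is the intuition behind using such a code to approximate nonlinear operations (Softmax, LayerNorm, GeLU) without retraining the source ANN.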
Problem

Research questions and friction points this paper is trying to address.

Handling nonlinear operations in Spiking Transformers
Eliminating fine-tuning for pre-trained ANNs
Achieving near-lossless conversion with low latency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free ANN-to-SNN conversion framework
Multi-basis Exponential Decay neuron
Near-lossless conversion with low latency
Jingya Wang
Assistant Professor, ShanghaiTech University
Computer Vision · Embodied AI · Human-Object Interaction
Xin Deng
Department of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China
Wenjie Wei
University of Electronic Science and Technology of China
Spiking Neural Network · Neuromorphic Computing · Model Compression · Event-based Vision
Dehao Zhang
University of Electronic Science and Technology of China
Spiking Neural Network
Shuai Wang
Department of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China
Qian Sun
Department of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China
Jieyuan Zhang
Department of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China
Hanwen Liu
Department of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China
Ning Xie
Department of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China
Malu Zhang
Department of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China