LAS: Loss-less ANN-SNN Conversion for Fully Spike-Driven Large Language Models

📅 2025-05-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing ANN-to-SNN conversion methods face two critical bottlenecks when applied to large language models (LLMs) and vision-language models (VLMs): (1) extreme activation outliers trigger spiking explosions, and (2) non-linearities such as ReLU and GELU are incompatible with spiking neural networks (SNNs), impeding lossless conversion. This work introduces a novel spiking neuron with adaptive threshold normalization and spike-equivalent Transformer components—including spike-based attention and an SNN-friendly, reparameterized feed-forward network (FFN). Together, these enable the first fully spiking, end-to-end, lossless ANN-to-SNN conversion. Evaluated on six LLMs and two VLMs, the method achieves zero accuracy degradation; notably, OPT-66B attains a 2% absolute improvement in accuracy on the Winograd Schema Challenge (WSC). The implementation is publicly available.
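The outlier problem the summary describes can be illustrated with a minimal rate-coded integrate-and-fire simulation. This is a hedged sketch of the general ANN-SNN conversion idea (a spike rate approximating a ReLU activation, and how the firing threshold interacts with outliers), not the LAS neuron itself; the function names and the threshold choices are illustrative assumptions.

```python
import numpy as np

def snn_rate(x, threshold, timesteps=64):
    """Rate-coded integrate-and-fire neuron (illustrative, not the LAS neuron).

    Each step the membrane potential accumulates the input x; when it crosses
    the threshold, the neuron emits a spike and the threshold is subtracted
    (soft reset). The returned value, spike_rate * threshold, approximates
    ReLU(x) up to quantization error.
    """
    v = np.zeros_like(x, dtype=float)       # membrane potential
    spikes = np.zeros_like(x, dtype=float)  # spike counts
    for _ in range(timesteps):
        v += x
        fired = v >= threshold
        spikes += fired                      # at most one spike per step
        v -= fired * threshold               # soft reset keeps the residual
    return spikes / timesteps * threshold

# 8.0 mimics an extreme activation outlier among otherwise small activations.
x = np.array([0.1, 0.5, -0.3, 0.9, 8.0])

# Naive conversion sets the threshold to the max activation: the outlier
# inflates it, so small activations are quantized coarsely (0.1 becomes 0).
coarse = snn_rate(x, threshold=x.max())

# A clipped/normalized threshold preserves small activations finely; the
# outlier saturates at the threshold instead (the trade-off such methods face).
fine = snn_rate(x, threshold=1.0)
```

With 64 timesteps, the max-based threshold maps 0.1 to 0 spikes, while the clipped threshold recovers it to within one quantization step; this is the tension between outlier coverage and representational precision that adaptive threshold normalization targets.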

📝 Abstract
Spiking Large Language Models (LLMs) have emerged as an energy-efficient alternative to conventional LLMs through their event-driven computation. To obtain spiking LLMs effectively, researchers have developed various ANN-to-SNN conversion methods that leverage pre-trained ANN parameters while inheriting the energy efficiency of SNNs. However, existing conversion methods struggle with the extreme activation outliers and incompatible nonlinear operations of ANN-based LLMs. To address this, we propose a loss-less ANN-SNN conversion for fully spike-driven LLMs, termed LAS. Specifically, LAS introduces two novel neurons to convert the activation outliers and nonlinear operations of ANN-based LLMs. Moreover, LAS tailors spike-equivalent Transformer components for spiking LLMs, ensuring fully spiking conversion without any loss of performance. Experimental results on six language models and two vision-language models demonstrate that LAS achieves loss-less conversion. Notably, on OPT-66B, LAS even improves accuracy by 2% on the WSC task. In addition, parameter and ablation studies further verify the effectiveness of LAS. The source code is available at https://github.com/lc783/LAS
Problem

Research questions and friction points this paper is trying to address.

Address activation outliers in ANN-SNN conversion
Resolve incompatible nonlinear operations in spiking LLMs
Ensure loss-less performance in fully spike-driven conversion
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces novel neurons for outlier conversion
Tailors spike-equivalent Transformer components
Ensures loss-less full spiking conversion
Long Chen
College of Computer Science, Sichuan University
Xiaotian Song
Sichuan University
Yanan Sun
College of Computer Science, Sichuan University