An End-to-End DNN Inference Framework for the SpiNNaker2 Neuromorphic MPSoC

📅 2025-07-18
🤖 AI Summary
This work addresses the challenge of deploying large, complex deep neural networks (DNNs), notably Transformers, on the SpiNNaker2 neuromorphic MPSoC for efficient inference. It presents an end-to-end PyTorch-to-SpiNNaker2 inference framework, built as an extension of OctopuScheduler, that integrates multi-layer DNN scheduling, model quantization, operator lowering, and mapping onto the neuromorphic hardware. The framework compiles PyTorch models and executes them directly on a single SpiNNaker2 chip. To the authors' knowledge, this is the first demonstration of end-to-end Transformer-scale DNN inference on SpiNNaker2. The work removes a key bottleneck in neuromorphic hardware support for large-scale DNN inference and provides a scalable, system-level path for deploying complex AI models at the edge.

📝 Abstract
This work presents a multi-layer DNN scheduling framework as an extension of OctopuScheduler, providing an end-to-end flow from PyTorch models to inference on a single SpiNNaker2 chip. Together with a front-end comprising quantization and lowering steps, the proposed framework enables the edge-based execution of large and complex DNNs up to transformer scale on the neuromorphic platform SpiNNaker2.
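The abstract describes a front-end with quantization and lowering steps. As a minimal sketch of what such a quantization pass might do, the snippet below implements symmetric per-tensor int8 weight quantization; the function names and the per-tensor scheme are illustrative assumptions, not the paper's documented method:

```python
import numpy as np

def quantize_symmetric_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ~= scale * q.
    (Illustrative sketch; the paper's exact quantization scheme may differ.)"""
    max_abs = np.abs(w).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from int8 values and a scale."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, s = quantize_symmetric_int8(w)
err = np.abs(w - dequantize(q, s)).max()  # bounded by scale / 2 (rounding)
```

Per-tensor symmetric quantization keeps the on-chip kernel simple (one scale per weight matrix); per-channel scales would trade a little extra bookkeeping for lower error.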
Problem

Research questions and friction points this paper is trying to address.

Develop end-to-end DNN inference for SpiNNaker2 neuromorphic chip
Enable edge-based execution of large-scale DNNs
Extend OctopuScheduler with PyTorch-to-SpiNNaker2 flow
Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end DNN scheduling framework for SpiNNaker2
PyTorch to neuromorphic chip inference flow
Supports transformer-scale DNNs on edge
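The contributions above center on scheduling a multi-layer DNN onto a single chip. A toy sketch of layer-wise execution, assuming each layer's matmul is partitioned column-wise across processing elements (PEs) and layers run sequentially; the partitioning scheme is an illustrative assumption, though 152 is SpiNNaker2's published PE count:

```python
import numpy as np

N_PES = 152  # SpiNNaker2 provides 152 processing elements per chip

def run_layer_on_pes(x: np.ndarray, w: np.ndarray, n_pes: int = N_PES):
    """Split one layer's output columns across PEs, then merge the partial results.
    (Illustrative partitioning; OctopuScheduler's actual mapping may differ.)"""
    col_groups = np.array_split(np.arange(w.shape[1]), min(n_pes, w.shape[1]))
    parts = [x @ w[:, idx] for idx in col_groups]  # each slice = one PE's share
    return np.concatenate(parts, axis=1)

def schedule_layers(x: np.ndarray, weights: list) -> np.ndarray:
    """Execute layers one after another, keeping only one layer on chip at a time."""
    for w in weights:
        x = np.maximum(run_layer_on_pes(x, w), 0.0)  # ReLU between layers
    return x
```

Sequential, one-layer-at-a-time execution is what lets a single chip handle models whose full weight set exceeds on-chip memory, at the cost of reloading weights per layer.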