bitSMM: A bit-Serial Matrix Multiplication Accelerator

📅 2026-03-16
🤖 AI Summary
This work addresses the challenge of deploying neural-network accelerators on board spacecraft, where stringent power and reliability constraints rule out many conventional designs. The authors propose bitSMM, a bit-serial matrix multiplication accelerator built around a systolic array, with runtime-configurable operand precision from 1 to 16 bits. Two multiply-accumulate (MAC) variants are evaluated: a Booth-inspired architecture and a standard binary multiplication with correction architecture. Implemented in SystemVerilog, the accelerator achieves up to 19.2 GOPS and 2.973 GOPS/W on an AMD ZCU104 FPGA. In ASIC physical implementation with the ASAP7 process design kit, it reaches up to 73.22 GOPS, 552 GOPS/mm², and 40.8 GOPS/W.

📝 Abstract
Neural-network (NN) inference is increasingly present on-board spacecraft to reduce downlink bandwidth and enable timely decision making. However, the power and reliability constraints of space missions limit the applicability of many state-of-the-art NN accelerators. This paper presents bitSMM, a bit-serial matrix multiplication accelerator built around a systolic array of bit-serial multiply-accumulate (MAC) units. The design supports runtime-configurable operand precision from 1 to 16 bits and evaluates two MAC variants: a Booth-inspired architecture and a standard binary multiplication with correction architecture. We implement bitSMM in [System]Verilog and evaluate it on an AMD ZCU104 FPGA and through ASIC physical implementation using the asap7 and nangate45 process design kits. On the FPGA, bitSMM achieves up to 19.2 GOPS and 2.973 GOPS/W, and in asap7 it achieves up to 73.22 GOPS, 552 GOPS/mm², and 40.8 GOPS/W.
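
The abstract names two bit-serial MAC styles without giving their internals. The Python sketch below is a software model of the general techniques those names suggest, not the paper's RTL. It assumes signed two's-complement operands, with one operand streamed LSB first. All function names (`to_bits`, `mac_binary_with_correction`, `mac_booth`, `matmul_bit_serial`) are hypothetical. The "correction" variant treats the most significant bit as having negative weight. The Booth-style variant uses radix-2 recoding with digits in {-1, 0, +1}. The `bits` argument stands in for the runtime-configurable precision.

```python
def to_bits(value, bits):
    """Return the two's-complement bits of `value`, LSB first."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit in {bits} signed bits")
    return [(value >> i) & 1 for i in range(bits)]


def mac_binary_with_correction(acc, a, b, bits):
    """acc + a*b, consuming one bit of b per step (assumed scheme).

    Plain binary shift-and-add handles bits 0..bits-2. The MSB of a
    two's-complement number weighs -2^(bits-1), so its partial product
    is subtracted instead of added: that is the correction step.
    """
    for i, bit in enumerate(to_bits(b, bits)):
        if bit:
            partial = a << i
            acc += -partial if i == bits - 1 else partial
    return acc


def mac_booth(acc, a, b, bits):
    """acc + a*b using radix-2 Booth recoding of b (assumed scheme).

    Each step looks at the current and previous bit of b and forms the
    digit d_i = b_(i-1) - b_i, which is -1, 0, or +1.
    """
    prev = 0  # implicit b_(-1) = 0
    for i, bit in enumerate(to_bits(b, bits)):
        digit = prev - bit
        acc += digit * (a << i)
        prev = bit
    return acc


def matmul_bit_serial(A, B, bits, mac=mac_booth):
    """Multiply matrices A (n x k) and B (k x m) with a bit-serial MAC.

    A systolic array would run these MACs in parallel; this loop only
    models the arithmetic, not the dataflow or timing.
    """
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                C[i][j] = mac(C[i][j], A[i][k], B[k][j], bits)
    return C
```

Both MAC functions return the same result for any operands that fit in `bits` signed bits, so either can be passed as `mac`. In this model, lowering `bits` shortens the inner loop of each MAC, which is the intuition behind trading precision for throughput in a bit-serial design.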
Problem

Research questions and friction points this paper is trying to address.

neural-network inference
spacecraft
power constraints
reliability constraints
NN accelerators
Innovation

Methods, ideas, or system contributions that make the work stand out.

bit-serial
systolic array
configurable precision
neural network accelerator
space-efficient computing