🤖 AI Summary
This work proposes a high-accuracy, lightweight object detection framework tailored for low-cost FPGA platforms. Building upon the YOLOv3-tiny architecture, the design employs a binary neural network (BNN) featuring 1-bit weights and 8-bit activations, while retaining fixed-point convolutions in the first and last layers. A novel channel compensation fusion strategy is introduced, integrating the Mul_prev operation directly into the BNN processing unit to enhance channel-wise compensation efficiency. The system is fully implemented in Verilog RTL and supports end-to-end deployment from ONNX models to FPGA BRAM storage. Evaluated on the VOC dataset, the approach achieves 39.6% mAP50 with only 0.098 GFLOPs and 0.74M parameters. RTL simulation outputs exhibit strong agreement with ONNX predictions, yielding a correlation coefficient of 0.999964 and a mean absolute error of 0.020027.
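The two agreement figures quoted above can be computed from a pair of flattened output tensors. The following is a minimal sketch, assuming the correlation coefficient is the Pearson coefficient (the source does not name the variant) and that both outputs have already been dumped to equal-length lists of floats. The function names `pearson_r` and `mean_abs_error` are hypothetical and are not taken from the work.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences (assumed metric)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_abs_error(a, b):
    """Mean of the element-wise absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    # Toy values only; these stand in for dumped RTL and ONNX outputs.
    rtl_out = [0.10, -1.25, 3.40, 0.00]
    onnx_out = [0.12, -1.20, 3.38, 0.01]
    print(pearson_r(rtl_out, onnx_out), mean_abs_error(rtl_out, onnx_out))
```

A correlation near 1 shows that the two outputs move together, while the mean absolute error captures any remaining offset or scale mismatch, which is why reporting both is informative.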
📝 Abstract
This paper implements a Binary Neural Network (BNN) based YOLOv3-tiny-like object detector on a low-cost FPGA. The network takes 320×320×3 RGB images as input. Its main convolution layers use 1-bit weights and 8-bit activations, while Conv1 and the final detection head use fixed-point standard convolutions. Weights, biases, and quantization parameters are extracted from the trained ONNX model, converted to fixed point, packed into COE files, and stored in Vivado BRAM ROMs. The hardware is written entirely in Verilog RTL and includes padding, line buffering, binary convolution, quantization post-processing, max pooling, and detection-head computation. For layers where Mul_prev is indexed by input channel and Div_current by output channel, Mul_prev is fused into the BNN PE so that channel-wise compensation is applied during accumulation. On VOC, the model obtains 39.6% mAP50 with 0.098 GFLOPs and 0.74M parameters. RTL simulation shows that the final raw detection output reaches a correlation coefficient of 0.999964 and a mean absolute error of 0.020027 against the corresponding ONNX node.
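The fusion of Mul_prev into the BNN PE can be illustrated with a small behavioural model. This is a sketch under stated assumptions, not the paper's RTL: it models a single spatial position (a 1×1 convolution), treats weight bit 1 as +1 and bit 0 as -1, uses integer Mul_prev values, and applies Div_current as a plain division after accumulation. The function names `reference_pe` and `fused_pe` are hypothetical.

```python
def reference_pe(x, w_bits, mul_prev, div_current):
    """Unfused path: scale every input channel first, then accumulate.

    x           : activations, one integer per input channel
    w_bits      : w_bits[o][c] is the 1-bit weight (assumed 1 -> +1, 0 -> -1)
    mul_prev    : one integer factor per INPUT channel
    div_current : one divisor per OUTPUT channel
    """
    scaled = [xi * m for xi, m in zip(x, mul_prev)]  # separate scaling pass
    out = []
    for o, row in enumerate(w_bits):
        acc = sum(s if bit else -s for s, bit in zip(scaled, row))
        out.append(acc / div_current[o])
    return out


def fused_pe(x, w_bits, mul_prev, div_current):
    """Fused path: the per-input-channel multiply happens inside the accumulate loop."""
    out = []
    for o, row in enumerate(w_bits):
        acc = 0
        for xi, m, bit in zip(x, mul_prev, row):
            term = xi * m                      # channel-wise compensation
            acc += term if bit else -term      # binary weight selects add or subtract
        out.append(acc / div_current[o])       # per-output-channel scaling
    return out


if __name__ == "__main__":
    x = [2, 3]
    w_bits = [[1, 0]]
    mul_prev = [5, 7]
    div_current = [4]
    print(fused_pe(x, w_bits, mul_prev, div_current))
```

Because Mul_prev depends only on the input channel, it can be applied term by term inside the sum without changing the result. In this model the fused path produces the same values as the reference path while avoiding a separate scaling pass over the activations.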