🤖 AI Summary
To enable high-quality beamforming for non-steered plane-wave ultrasound imaging on resource-constrained edge devices, this paper proposes CapsBeam, a capsule-network beamformer that reconstructs envelope images directly from raw radio-frequency (RF) data. The model is compressed with multi-layer LookAhead Kernel Pruning (LAKP-ML) and further simplified through quantization, simplified non-linear operations, and parallelization, which makes an FPGA accelerator with hardware support for dynamic routing practical. Compared with delay-and-sum (DAS) beamforming, CapsBeam improves contrast by 32.31% and axial/lateral resolution by 16.54%/6.7% on in-vitro data, and improves contrast by 26% and axial/lateral resolution by 13.6%/21.5% on in-silico data; on in-vivo data it reduces artifacts. Pruning achieves a compression ratio of 85% without affecting image quality, and the accelerator, implemented on a Xilinx ZU7EV FPGA, reaches a throughput of 30 GOPS for convolution and 17.4 GOPS for dynamic routing.
📝 Abstract
In recent years, there has been a growing trend toward using deep learning models to accelerate computationally complex, non-real-time beamforming algorithms in ultrasound imaging. However, the large size and complexity of these state-of-the-art deep learning techniques pose significant challenges for deployment on resource-constrained edge devices. In this work, we propose a novel capsule-network-based beamformer called CapsBeam, designed to operate on raw radio-frequency data and provide an envelope of beamformed data through non-steered plane-wave insonification. In experiments on in-vivo data, CapsBeam reduced artifacts compared to standard Delay-and-Sum (DAS) beamforming. For in-vitro data, CapsBeam demonstrated a 32.31% increase in contrast, along with gains of 16.54% and 6.7% in axial and lateral resolution, respectively, compared to DAS. Similarly, in-silico data showed a 26% enhancement in contrast, along with improvements of 13.6% and 21.5% in axial and lateral resolution, respectively, compared to DAS. To reduce parameter redundancy and enhance computational efficiency, we pruned the model using our multi-layer LookAhead Kernel Pruning (LAKP-ML) methodology, achieving a compression ratio of 85% without affecting the image quality. Additionally, the hardware complexity of the proposed model is reduced by applying quantization, simplifying non-linear operations, and parallelizing operations. Finally, we propose a specialized accelerator architecture for the pruned and optimized CapsBeam model, implemented on a Xilinx ZU7EV FPGA. The proposed accelerator achieved a throughput of 30 GOPS for the convolution operation and 17.4 GOPS for the dynamic routing operation.
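For readers unfamiliar with the DAS baseline that the results above are measured against, the following is a minimal sketch of delay-and-sum for a single pixel under a non-steered (0 degree) plane-wave transmit. It is a generic textbook formulation, not the authors' implementation. The function names, the 8-element aperture, the 0.3 mm pitch, the 1540 m/s sound speed, and the 40 MHz sampling rate are all illustrative assumptions and do not come from the paper. The output is a beamformed RF value; envelope detection would be a separate step.

```python
import math

def das_pixel(rf, element_x, x, z, c=1540.0, fs=40e6):
    """Delay-and-sum value for one pixel at (x, z), unsteered plane-wave transmit.

    rf: list of per-element sample lists (rf[i][n] is sample n of element i).
    element_x: lateral position of each element in metres (elements sit at depth 0).
    c, fs: assumed speed of sound (m/s) and sampling rate (Hz).
    """
    total = 0.0
    for trace, xi in zip(rf, element_x):
        # Transmit: an unsteered plane wave reaches depth z after z / c.
        # Receive: the echo travels from (x, z) back to the element at (xi, 0).
        tau = (z + math.hypot(x - xi, z)) / c
        n = tau * fs
        n0 = math.floor(n)
        if 0 <= n0 < len(trace) - 1:
            frac = n - n0
            # Linear interpolation between the two neighbouring samples.
            total += (1.0 - frac) * trace[n0] + frac * trace[n0 + 1]
    return total

def point_echo(element_x, xs, zs, n_samples, c=1540.0, fs=40e6):
    """Toy RF data: one unit impulse per element at the echo's nearest sample."""
    rf = []
    for xi in element_x:
        trace = [0.0] * n_samples
        n = round((zs + math.hypot(xs - xi, zs)) / c * fs)
        trace[n] = 1.0
        rf.append(trace)
    return rf

# Hypothetical 8-element aperture with 0.3 mm pitch and a scatterer at 20 mm depth.
elements = [(i - 3.5) * 0.3e-3 for i in range(8)]
rf = point_echo(elements, 0.0, 20e-3, 2048)
on_target = das_pixel(rf, elements, 0.0, 20e-3)   # echoes add coherently
off_target = das_pixel(rf, elements, 0.0, 25e-3)  # delays miss the echoes
```

In this toy setup the per-element delays line up the echoes at the true scatterer position, so the sum is large there and vanishes 5 mm deeper. A learned beamformer such as the one described in the abstract replaces this fixed geometric summation with a trained mapping from RF data to the envelope image.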