🤖 AI Summary
This work addresses the high computational complexity of the displacement vector (DV) search in the Intra Pattern Copy (IPC) tool for JPEG XS, which makes efficient hardware deployment difficult. The authors propose a dedicated pipelined FPGA architecture for the DV search, with a memory organization that exploits IPC's computational characteristics and data reuse patterns to improve performance. Implemented on a Xilinx FPGA, the design achieves a throughput of 38.3 megapixels per second while consuming 277 mW, demonstrating its hardware feasibility and providing a foundation for future ASIC implementations.
📝 Abstract
Recently, progress has been made on the Intra Pattern Copy (IPC) tool for JPEG XS, an image compression standard designed for low-latency and low-complexity coding. IPC performs wavelet-domain intra compensation prediction to reduce spatial redundancy in screen content. A key module of IPC is the displacement vector (DV) search, which finds the optimal prediction reference offset. However, the DV search is computationally intensive, which poses challenges for practical hardware deployment. In this paper, we propose an efficient pipelined FPGA architecture for the DV search module to promote the practical deployment of IPC. We further introduce an optimized memory organization that leverages IPC's computational characteristics and inherent data reuse patterns to enhance performance. Experimental results show that the proposed architecture achieves a throughput of 38.3 Mpixels/s with a power consumption of 277 mW, demonstrating its feasibility for practical hardware implementation of IPC and other predictive coding tools and providing a promising foundation for ASIC deployment.
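
To make the cost of the DV search concrete, the following is a minimal software sketch of an exhaustive search over candidate offsets. It is not the architecture or the cost function from the paper: the 1-D layout, the sum of absolute differences (SAD) cost, and the names `sad`, `search_dv`, `block_len`, and `search_range` are all assumptions chosen for illustration.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def search_dv(coeffs, pos, block_len, search_range):
    """Exhaustively search for the backward offset (DV) whose reference
    block best matches the current block.

    Assumptions (illustrative only): coefficients form a 1-D list, the
    reference must lie entirely before the current block, and the
    matching cost is SAD.

    Returns (best_dv, best_cost), or (None, None) if no candidate fits.
    """
    cur = coeffs[pos:pos + block_len]
    best_dv, best_cost = None, None
    # Start at block_len so the reference never overlaps the current block.
    for dv in range(block_len, search_range + 1):
        start = pos - dv
        if start < 0:
            break  # candidate would reach before the start of the data
        cost = sad(cur, coeffs[start:start + block_len])
        if best_cost is None or cost < best_cost:
            best_dv, best_cost = dv, cost
    return best_dv, best_cost
```

Each candidate offset costs `block_len` absolute differences, so the work per block grows with the product of the block size and the search range. Neighbouring candidates also read almost the same reference samples, which is the kind of data reuse that a hardware memory organization can exploit.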