🤖 AI Summary
To address inefficient hardware resource utilization caused by uniform-bitwidth quantization in learned image compression (LIC) models, this paper proposes a flexible mixed-precision quantization framework. The method introduces a novel layer-wise sensitivity metric, the fractional rate-distortion loss change ratio, to guide bitwidth allocation across layers. Furthermore, it devises an efficient adaptive search algorithm that rapidly identifies the optimal mixed-precision configuration under a given model size constraint. Experimental results demonstrate that the proposed approach significantly improves hardware resource efficiency and achieves, on average, superior BD-Rate performance compared to state-of-the-art LIC quantization methods at equivalent parameter counts. The implementation is publicly available.
📄 Abstract
Despite improved coding performance over traditional codecs, Learned Image Compression (LIC) incurs large storage and deployment costs. Model quantization offers an effective way to reduce the computational complexity of LIC models. However, most existing works perform fixed-precision quantization, which utilizes resources sub-optimally because different layers of a neural network vary in their sensitivity to quantization. In this paper, we propose a Flexible Mixed Precision Quantization (FMPQ) method that assigns different bit-widths to different layers of the quantized network, using the fractional change in rate-distortion loss as the bit-assignment criterion. We also introduce an adaptive search algorithm that reduces the time complexity of searching for the desired distribution of quantization bit-widths under a fixed model size. Evaluation shows that our method improves BD-Rate performance under similar model size constraints compared to other works on quantization of LIC models. The source code is available at gitlab.com/viper-purdue/fmpq.
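The core idea, a layer-wise sensitivity metric (fractional change in rate-distortion loss) driving bit-width assignment under a model size budget, can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual algorithm: the greedy lowering strategy, function names, bit-width choices, and all numbers below are assumptions made for demonstration.

```python
# Hedged sketch: layers whose quantization barely changes the R-D loss
# receive fewer bits, subject to a total model-size budget. The greedy
# strategy and all names/numbers here are illustrative assumptions.

def fractional_loss_change(loss_fp, loss_quant):
    """Sensitivity metric: relative change in R-D loss after quantizing a layer."""
    return (loss_quant - loss_fp) / loss_fp

def allocate_bitwidths(sensitivities, params, budget_bits, choices=(8, 6, 4)):
    """Greedy allocation: start every layer at the widest bit-width, then
    repeatedly lower the least-sensitive layer one step until the total
    model size fits within budget_bits."""
    bits = {name: choices[0] for name in sensitivities}

    def total_size():
        return sum(params[n] * bits[n] for n in bits)

    while total_size() > budget_bits:
        # Only layers that can still be lowered are candidates.
        cands = [n for n in bits if bits[n] > choices[-1]]
        if not cands:
            break  # budget unreachable with the given bit-width choices
        victim = min(cands, key=lambda n: sensitivities[n])
        bits[victim] = choices[choices.index(bits[victim]) + 1]
    return bits

# Toy example: three layers with made-up sensitivities and parameter counts.
sens = {"conv1": 0.002, "conv2": 0.030, "conv3": 0.001}
params = {"conv1": 10_000, "conv2": 10_000, "conv3": 10_000}
bits = allocate_bitwidths(sens, params, budget_bits=10_000 * 18)
print(bits)  # least-sensitive layers end up with the fewest bits
```

In this sketch the most sensitive layer (`conv2`) keeps 8 bits while the least sensitive (`conv3`) drops to 4; the paper's adaptive search would replace the naive greedy loop to reach the desired configuration faster.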