🤖 AI Summary
To address the high computational complexity of learned image compression models, which hinders their deployment on edge devices, this paper proposes a low-complexity hierarchical feature coding framework. The core method introduces a learnable hierarchical feature transform that balances channel count against spatial resolution, using high-resolution, low-channel representations in early layers and low-resolution, high-channel representations in deeper ones. It further integrates channel-spatial decoupled quantization, a lightweight entropy model, and end-to-end rate-distortion optimization. Experimental results show that the proposed model reduces forward computational complexity from 1256 kMAC/pixel to 270 kMAC/pixel, a roughly 4.7× reduction, while maintaining competitive rate-distortion performance, yielding a strong efficiency-complexity trade-off and enabling real-time inference on resource-constrained edge hardware.
📝 Abstract
Current learned image compression models typically exhibit high complexity and demand significant computational resources. To overcome this challenge, we propose an approach that employs hierarchical feature extraction transforms to significantly reduce complexity while preserving bit-rate reduction efficiency. Our architecture uses fewer channels for inputs and feature maps at high spatial resolution, while feature maps with many channels have reduced spatial dimensions, thereby cutting computational load without sacrificing performance. This strategy reduces forward-pass complexity from 1256 kMAC/pixel to just 270 kMAC/pixel. The reduced-complexity model can enable learned image compression to operate efficiently across a wide range of devices and paves the way for new low-complexity architectures in image compression.
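The complexity argument above is easy to make concrete: for a convolution with a k×k kernel, the MAC count per input-image pixel scales as k²·C_in·C_out divided by the square of the feature map's downscale factor, so keeping channel counts low until the resolution has shrunk cuts the total sharply. The sketch below illustrates this accounting with hypothetical layer configurations (the kernel sizes and channel counts are illustrative assumptions, not the paper's actual architecture, so the totals do not reproduce the 1256 and 270 kMAC/pixel figures).

```python
def conv_kmac_per_pixel(k, c_in, c_out, downscale):
    """kMAC per input-image pixel for a k x k conv whose output
    feature map is at 1/downscale of the input resolution.
    (Hypothetical helper for back-of-envelope accounting.)"""
    return k * k * c_in * c_out / (downscale ** 2) / 1000.0

# Baseline-style transform: wide channel count even at high resolution.
# Each tuple: (kernel, in_channels, out_channels, cumulative downscale).
baseline = [
    (5, 3,   192, 2),
    (5, 192, 192, 4),
    (5, 192, 192, 8),
    (5, 192, 192, 16),
]

# Hierarchical-style transform: few channels at high resolution;
# channel count grows only as the spatial resolution shrinks.
hierarchical = [
    (5, 3,   24,  2),
    (5, 24,  48,  4),
    (5, 48,  96,  8),
    (5, 96,  192, 16),
]

def total_kmac(layers):
    return sum(conv_kmac_per_pixel(*layer) for layer in layers)

print(f"baseline:     {total_kmac(baseline):.2f} kMAC/pixel")      # 79.20
print(f"hierarchical: {total_kmac(hierarchical):.2f} kMAC/pixel")  # 5.85
```

Note how the second layer dominates the baseline's cost: a 192-to-192-channel convolution at quarter-area resolution is far more expensive than the hierarchical stack's 24-to-48-channel layer at the same scale, which is exactly the imbalance the hierarchical transform is designed to avoid.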