AI Summary
Existing learned image compression (LIC) methods typically employ static, uniform quantization bit-widths, which fail to accommodate the heterogeneous, sensitivity-diverse feature distributions within models, leading to suboptimal trade-offs between rate-distortion performance and computational efficiency. To address this, we propose DynaQuant, a dynamic mixed-precision quantization framework featuring a novel dual-level bit-width selection mechanism: content-aware quantization at the feature level and data-driven adaptation at the channel level. We further introduce a distance-aware gradient modulator (DGM) to enable end-to-end differentiable optimization. Leveraging learnable scaling/offset parameters and a lightweight bit-width selection network, DynaQuant adaptively allocates precision at both the feature and channel granularities. Experiments demonstrate that DynaQuant preserves the rate-distortion performance of full-precision models while significantly reducing computational cost and memory footprint, thereby enhancing deployment flexibility across diverse hardware platforms.
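The channel-level, data-driven bit-width selection described above can be illustrated with a minimal sketch. This is not the paper's architecture: the choice of per-channel statistics (mean and standard deviation), the candidate bit-widths, and the linear selector `W`, `b` are all assumptions made purely for illustration of how a lightweight network could map channel content to a precision choice.

```python
import numpy as np

# Hypothetical candidate precisions; the paper's actual set is not specified here.
CANDIDATE_BITS = np.array([2, 4, 8])

def select_bitwidths(features, W, b):
    """Assign one candidate bit-width per channel from simple channel statistics.

    features: (C, N) latent values per channel.
    W, b: parameters of a tiny (hypothetical) linear selector.
    """
    # Per-channel summary statistics as the selector's input (an assumption).
    stats = np.stack([features.mean(axis=1), features.std(axis=1)], axis=1)  # (C, 2)
    logits = stats @ W + b                                   # (C, 3) scores per bit-width
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)                # softmax over candidates
    return CANDIDATE_BITS[probs.argmax(axis=1)]              # hard choice at inference

rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 128))        # 4 channels, 128 latent values each
W = rng.normal(size=(2, 3))
b = np.zeros(3)
bits = select_bitwidths(feats, W, b)     # one bit-width per channel
```

At training time such a hard argmax would be relaxed (e.g. with a soft or stochastic selection) so the selector remains differentiable; the sketch shows only the inference-time behavior.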
Abstract
Prevailing quantization techniques in Learned Image Compression (LIC) typically employ a static, uniform bit-width across all layers, failing to adapt to the highly diverse data distributions and sensitivity characteristics inherent in LIC models. This leads to a suboptimal trade-off between performance and efficiency. In this paper, we introduce DynaQuant, a novel framework for dynamic mixed-precision quantization that operates on two complementary levels. First, we propose content-aware quantization, where learnable scaling and offset parameters dynamically adapt to the statistical variations of latent features. This fine-grained adaptation is trained end-to-end using a novel Distance-aware Gradient Modulator (DGM), which provides a more informative learning signal than the standard Straight-Through Estimator. Second, we introduce a data-driven, dynamic bit-width selector that learns to assign an optimal bit precision to each layer, dynamically reconfiguring the network's precision profile based on the input data. Our fully dynamic approach offers substantial flexibility in balancing rate-distortion (R-D) performance and computational cost. Experiments demonstrate that DynaQuant achieves R-D performance comparable to full-precision models while significantly reducing computational and storage requirements, thereby enabling the practical deployment of advanced LIC on diverse hardware platforms.
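A minimal sketch of the two ingredients named in the abstract, the learnable scale/offset quantizer and a distance-aware gradient weight, may help fix ideas. The quantizer below is standard uniform affine quantization; the `dgm_weight` function is only our guess at the DGM's spirit (the Straight-Through Estimator passes a constant gradient of 1, whereas a distance-aware modulator would scale the backward signal by how far a value sits from its nearest quantization level). The specific weighting formula is an assumption, not the paper's.

```python
import numpy as np

def quantize(x, scale, offset, bits):
    """Uniform affine quantization with learnable scale/offset (illustrative)."""
    levels = 2 ** bits - 1
    q = np.clip(np.round(x / scale + offset), 0, levels)  # integer code in [0, levels]
    return (q - offset) * scale                           # dequantized reconstruction

def dgm_weight(x, scale, offset):
    """Illustrative distance-aware gradient weight (NOT the paper's formula).

    The STE would return a constant 1; here the backward signal is modulated by
    the distance of x/scale + offset to its nearest quantization level.
    """
    r = x / scale + offset
    dist = np.abs(r - np.round(r))   # distance to nearest level, in [0, 0.5]
    return 1.0 - dist                # weight in [0.5, 1]

x = np.array([0.1, 0.5, 0.9])
x_hat = quantize(x, scale=0.25, offset=0.0, bits=4)  # -> [0.0, 0.5, 1.0]
w = dgm_weight(x, scale=0.25, offset=0.0)            # -> [0.6, 1.0, 0.6]
```

In a real training loop, `w` would multiply the incoming gradient in a custom backward pass (e.g. a `torch.autograd.Function`), replacing the STE's identity gradient.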