🤖 AI Summary
This work addresses a limitation of existing region-of-interest (ROI)-based image compression methods: they use Gaussian models to approximate the sharp-peaked, heavy-tailed distributions of latent variables, which constrains rate-distortion performance. To overcome this, the authors propose a generalized Gaussian model (GGM) tailored to ROI compression and developed within a unified rate-distortion optimization framework. By combining differentiable approximation functions, a dynamic lower bound, and finite-difference gradient estimation, the approach models complex latent distributions more accurately, trains stably, and alleviates train-test mismatch. Experiments on COCO2017 show that the proposed method achieves state-of-the-art ROI reconstruction quality and superior coding performance on downstream tasks such as segmentation and object detection.
📝 Abstract
Region-of-Interest (ROI)-based image compression allocates bits unevenly according to the semantic importance of different regions. Such differentiated coding typically induces a sharp-peaked, heavy-tailed distribution of the latent variables. Describing this distribution accurately requires a probability model with an adaptable shape parameter. However, existing methods commonly fit it with a Gaussian model, which costs coding performance. To systematically analyze the impact of this distribution on ROI coding, we develop a unified theoretical framework for rate-distortion optimization. Building on this framework, we propose a novel Generalized Gaussian Model (GGM) that models the latent-variable distribution flexibly. To support stable optimization of the GGM, we introduce effective differentiable functions and further propose a dynamic lower bound to alleviate train-test mismatch. Moreover, we use finite differences to compute gradients once the GGM has been fitted to the distribution. Experiments on COCO2017 demonstrate that our method achieves state-of-the-art performance in both ROI reconstruction and downstream tasks (e.g., segmentation and object detection). Furthermore, compared to classical probability models, our GGM fits feature distributions more precisely and achieves superior coding performance. The project page is at https://github.com/hukai-tju/ROIGGM.
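To make the role of the shape parameter concrete, the following is a minimal sketch of a generalized Gaussian entropy model for an integer-quantized latent, together with a central finite-difference estimate of the rate gradient with respect to the shape. It is not taken from the project code: the function names (`ggm_pdf`, `bin_probability`, `bits`, `dbits_dbeta`), the unit-width quantization bins, the Simpson's-rule integration, the probability floor, and the step size `eps` are all assumptions made for illustration, and the paper's differentiable functions and dynamic lower bound are not reproduced here.

```python
import math


def ggm_pdf(x, mu, alpha, beta):
    """Generalized Gaussian density: location mu, scale alpha, shape beta.

    beta = 2 recovers a Gaussian with sigma = alpha / sqrt(2);
    beta < 2 gives a sharper peak and heavier tails.
    """
    coeff = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return coeff * math.exp(-((abs(x - mu) / alpha) ** beta))


def bin_probability(y, mu, alpha, beta, steps=64):
    """Probability mass on [y - 0.5, y + 0.5] (assumed unit-width bin).

    Uses composite Simpson's rule as a simple stand-in for a
    closed-form CDF difference; steps must be even.
    """
    a = y - 0.5
    h = 1.0 / steps
    total = ggm_pdf(a, mu, alpha, beta) + ggm_pdf(a + 1.0, mu, alpha, beta)
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * ggm_pdf(a + i * h, mu, alpha, beta)
    return total * h / 3.0


def bits(y, mu, alpha, beta, floor=1e-9):
    """Estimated code length in bits; floor (hypothetical) avoids log(0)."""
    p = max(bin_probability(y, mu, alpha, beta), floor)
    return -math.log2(p)


def dbits_dbeta(y, mu, alpha, beta, eps=1e-3):
    """Central finite-difference estimate of d(bits)/d(beta)."""
    upper = bits(y, mu, alpha, beta + eps)
    lower = bits(y, mu, alpha, beta - eps)
    return (upper - lower) / (2.0 * eps)


if __name__ == "__main__":
    # Compare the cost of an outlier symbol under a Gaussian (beta = 2)
    # and a heavier-tailed GGM (beta = 0.8); the values are illustrative.
    y = 6.0
    print("Gaussian bits:", bits(y, 0.0, math.sqrt(2.0), 2.0))
    print("GGM bits:     ", bits(y, 0.0, 1.0, 0.8))
    print("d bits / d beta:", dbits_dbeta(y, 0.0, 1.0, 0.8))
```

In this toy setting, a latent far from the mean costs far fewer bits under the heavier-tailed model than under the Gaussian, which is the kind of mismatch the abstract attributes to Gaussian fitting. The finite-difference step only approximates the gradient and is shown here for a single scalar; a training pipeline would need a batched, framework-native version.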