Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression

📅 2025-03-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing autoencoder-based image compression methods suffer from suboptimal rate-distortion performance at high bit rates and cannot cover a broad range of bit rates with a single model. To address this, the authors propose a variable-rate image compression model built on invertible transforms. The method introduces a lightweight multi-scale invertible neural network that establishes a bijective mapping between the image and hierarchical latent representations, ensuring lossless invertibility and rate adaptability across the full range. It further devises an entropy estimation mechanism that combines multi-scale spatial-channel context modeling with extended gain units. Evaluated across an exceptionally wide bit-rate range (0.01–1.0 bpp), the single model consistently outperforms the VVC standard, particularly at high bit rates, and achieves state-of-the-art variable-rate performance, matching or exceeding multi-model approaches. The source code is publicly available.
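The key property claimed above is that an invertible network loses no information: the latents determine the image exactly. This is not the paper's architecture, but a minimal NumPy sketch of the underlying principle: an additive coupling layer is exactly invertible by construction, and factoring out half of the dimensions after each level produces multi-scale latents. All names (`coupling_forward`, `forward_multiscale`, the toy parameter shapes) are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def coupling_forward(x, w, b):
    # Split the vector in two; transform the second half conditioned
    # on the first. The first half passes through unchanged, so the
    # step is trivially invertible.
    x1, x2 = np.split(x, 2)
    y2 = x2 + np.tanh(w @ x1 + b)
    return np.concatenate([x1, y2])

def coupling_inverse(y, w, b):
    y1, y2 = np.split(y, 2)
    x2 = y2 - np.tanh(w @ y1 + b)   # subtract the same conditioner
    return np.concatenate([y1, x2])

def forward_multiscale(x, params):
    # After each coupling, factor out half of the dimensions as the
    # latent for that scale; the rest continues to the next level.
    latents, h = [], x
    for w, b in params:
        h = coupling_forward(h, w, b)
        h, z = np.split(h, 2)
        latents.append(z)
    latents.append(h)               # final (coarsest) residual latent
    return latents

def inverse_multiscale(latents, params):
    # Reassemble from coarse to fine, inverting each coupling exactly.
    h = latents[-1]
    for (w, b), z in zip(reversed(params), reversed(latents[:-1])):
        h = coupling_inverse(np.concatenate([h, z]), w, b)
    return h
```

Running the forward pass and then the inverse on random inputs reconstructs the input to floating-point precision, which is what makes lossless invertibility and high-bit-rate fidelity possible in principle (the paper's actual network uses convolutional couplings on image tensors).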

📝 Abstract
Autoencoder-based structures have dominated recent learned image compression methods. However, the inherent information loss associated with autoencoders limits their rate-distortion performance at high bit rates and restricts their flexibility of rate adaptation. In this paper, we present a variable-rate image compression model based on invertible transform to overcome these limitations. Specifically, we design a lightweight multi-scale invertible neural network, which bijectively maps the input image into multi-scale latent representations. To improve the compression efficiency, a multi-scale spatial-channel context model with extended gain units is devised to estimate the entropy of the latent representation from high to low levels. Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared to existing variable-rate methods, and remains competitive with recent multi-model approaches. Notably, our method is the first learned image compression solution that outperforms VVC across a very wide range of bit rates using a single model, especially at high bit rates. The source code is available at https://github.com/hytu99/MSINN-VRLIC.
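Gain units are the standard mechanism behind single-model variable-rate coding: a learned per-channel gain scales the latent before quantization (larger gain means a finer effective step and a higher rate), and interpolating between trained gain vectors gives continuous rate control. The sketch below shows that mechanism in its generic form, assuming uniform rounding as the quantizer; the paper's "extended" gain units presumably refine this, and all function names here are illustrative.

```python
import numpy as np

def encode(z, gain):
    # Channel-wise gain scales the latent before uniform quantization;
    # a larger gain means a finer effective quantization step.
    return np.round(z * gain)

def decode(q, gain):
    # The decoder applies the inverse gain to the quantized symbols.
    return q / gain

def interp_gain(g_lo, g_hi, t):
    # Exponential interpolation between two trained gain vectors yields
    # intermediate rates from the same model (t in [0, 1]).
    return g_lo ** (1.0 - t) * g_hi ** t
```

With a small gain the reconstruction error is on the order of the rounding step; increasing the gain shrinks it proportionally, which is how one model trades rate for distortion without retraining.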
Problem

Research questions and friction points this paper is trying to address.

Overcoming autoencoder limitations in image compression
Enhancing rate-distortion performance at high bit rates
Achieving wide-range variable-rate compression with one model
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-scale invertible neural network for compression
Bijective mapping into multi-scale latent representations
Multi-scale spatial-channel context model for entropy estimation