FD-LSCIC: Frequency Decomposition-based Learned Screen Content Image Compression

📅 2025-02-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the performance degradation of existing learning-based methods on screen content (SC) images, which is caused by sharp edges, embedded text and graphics, and repetitive textures. We propose the first end-to-end frequency-decomposition learning framework for SC compression. Methodologically, we design a multi-frequency two-stage octave residual block (MToRB) and a cascaded triple-scale feature fusion residual block (CTSFRB), introduce an adaptive quantization module that operates per frequency component, and construct SDU-SCICD10K, the first large-scale, SC-specific dataset, comprising over 10,000 images. Our key contributions are: (i) the first integration of multi-frequency modeling and adaptive quantization within a unified SC compression framework; and (ii) state-of-the-art performance: our model significantly outperforms HEVC, VVC, and leading learned codecs in both PSNR and MS-SSIM, especially at high compression ratios, while preserving text and graphic fidelity.
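To make the idea of frequency decomposition concrete, here is a minimal sketch that splits one row of pixels into a half-resolution low-frequency part and a full-resolution high-frequency residual. It is an illustration only, not the paper's octave residual block: the fixed pairwise averaging, the function names `split_frequencies` and `merge_frequencies`, and the sample row are all assumptions made for this example, whereas the paper's blocks are learned.

```python
def upsample(low):
    """Repeat each low-band sample twice to return to full resolution."""
    return [value for value in low for _ in range(2)]


def split_frequencies(signal):
    """Split an even-length 1-D signal into (low, high) parts.

    Hypothetical helper for illustration; a learned octave block would
    replace the fixed averaging used here.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    # Low band: average each non-overlapping pair (half resolution).
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    # High band: whatever the smooth approximation misses (edges, detail).
    high = [x - u for x, u in zip(signal, upsample(low))]
    return low, high


def merge_frequencies(low, high):
    """Invert split_frequencies by adding the residual back."""
    return [u + h for u, h in zip(upsample(low), high)]


# A sharp, text-like edge: the kind of structure typical of screen content.
row = [0.0, 0.0, 0.0, 255.0, 255.0, 255.0]
low, high = split_frequencies(row)
restored = merge_frequencies(low, high)
```

In this toy row the high-frequency part is non-zero only around the edge, which hints at why handling the two parts separately can suit images dominated by sharp transitions.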

📝 Abstract
Learned image compression (LIC) methods have already surpassed traditional techniques in compressing natural scene (NS) images. However, directly applying these methods to screen content (SC) images, which have distinct characteristics such as sharp edges, repetitive patterns, and embedded text and graphics, yields suboptimal results. This paper addresses three key challenges in SC image compression: learning compact latent features, adapting quantization step sizes, and the lack of large SC datasets. To overcome these challenges, we propose a novel compression method that employs a multi-frequency two-stage octave residual block (MToRB) for feature extraction, a cascaded triple-scale feature fusion residual block (CTSFRB) for multi-scale feature integration, and a multi-frequency context interaction module (MFCIM) to reduce inter-frequency correlations. Additionally, we introduce an adaptive quantization module that learns scaled uniform noise for each frequency component, enabling flexible control over quantization granularity. Furthermore, we construct a large SC image compression dataset (SDU-SCICD10K), which includes over 10,000 images spanning basic SC images, computer-rendered images, and mixed NS and SC images from both PC and mobile platforms. Experimental results demonstrate that our approach significantly improves SC image compression performance, outperforming traditional standards and state-of-the-art learning-based methods in terms of peak signal-to-noise ratio (PSNR) and multi-scale structural similarity (MS-SSIM).
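The adaptive quantization idea in the abstract can be illustrated with a common pattern from learned compression: during training, rounding is replaced by additive uniform noise, and scaling that noise by a step size controls how coarse the quantization is. The sketch below assumes this pattern and is not taken from the paper; the function names, the latent values, and the step sizes in `steps` are hypothetical, and in the described method the scales would be learned per frequency component rather than fixed by hand.

```python
import random


def noisy_quantize(latents, step, rng):
    """Training-time surrogate: add uniform noise scaled by the step size."""
    return [y + step * rng.uniform(-0.5, 0.5) for y in latents]


def hard_quantize(latents, step):
    """Inference-time quantization: snap to the nearest multiple of step."""
    return [step * round(y / step) for y in latents]


rng = random.Random(0)  # seeded so the sketch is repeatable

# Hypothetical latents for two frequency components (same values on purpose,
# so the effect of the step size is easy to compare).
latents = [0.37, -1.42, 2.08]
steps = {"fine": 0.25, "coarse": 1.0}  # assumed values; learned in practice

fine = hard_quantize(latents, steps["fine"])
coarse = hard_quantize(latents, steps["coarse"])
noisy = noisy_quantize(latents, steps["coarse"], rng)
```

A smaller step keeps more detail at a higher bit cost, while a larger step discards detail to save bits; the noisy version stays within half a step of the original values, which is what lets it stand in for rounding during gradient-based training.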
Problem

Research questions and friction points this paper is trying to address.

Screen content image compression
Learning compact latent features
Adapting quantization step sizes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-frequency feature extraction
Adaptive quantization module
Large SC image dataset
Shiqi Jiang
School of Control Science and Engineering, Shandong University, Ji’nan, 250100, China
Hui Yuan
School of Control Science and Engineering, Shandong University, Ji’nan, 250100, China, and also with the Shandong Inspur Artificial Intelligence Research Institute Co., Ltd., Ji’nan, China
Shuai Li
School of Control Science and Engineering, Shandong University, Ji’nan, 250100, China
Huanqiang Zeng
Huaqiao University, China
Image Processing, Video Coding, Computer Vision
Sam Kwong
Lingnan University, Hong Kong
Video Coding, Evolutionary Computation, Machine Learning and Pattern Recognition