🤖 AI Summary
This work addresses a limitation of existing mezzanine image codecs: they lack screen-content-specific coding tools and struggle to maintain high visual quality under bandwidth constraints, particularly in dense text regions. The authors propose the High-quality Lightweight Codec (HLC), a lightweight mezzanine coding scheme built around a novel data-dependency-free, high-throughput palette mode. To keep the scheme effective across content types, a rate-distortion optimization module arbitrates between the palette and traditional prediction modes, and data is reused between rate estimation and entropy coding to reduce hardware cost. Compared with a 4K@120fps JPEG-XS encoder at the same throughput, HLC requires only half the LUT resources and achieves BD-PSNR gains of 3.461 dB, 3.299 dB, and 5.312 dB on gaming, natural, and text content, respectively.
📝 Abstract
Existing mezzanine image codecs lack specialized screen content coding tools and therefore struggle to maintain high image quality under bandwidth constraints, especially in areas with dense text. Although distribution codecs offer advanced screen content compression techniques, their high computational complexity makes them impractical for mezzanine coding. To address this shortfall, we introduce the High-quality Lightweight Codec (HLC), a solution centered on enabling practical, high-throughput palette coding for mezzanine use. The core innovation is a novel data-dependency-free palette design that eliminates throughput bottlenecks. To ensure its effectiveness across all content types, a co-designed rate-distortion optimization module arbitrates between the palette and traditional prediction modes, while a data reuse strategy between rate estimation and entropy coding minimizes the hardware resources required by the overall system. Experimental results show that, compared with a 4K@120fps JPEG-XS encoder, HLC achieves the same throughput while using only half the LUT resources and delivers BD-PSNR improvements of 3.461 dB, 3.299 dB, and 5.312 dB on gaming, natural, and text content datasets, respectively.
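
The arbitration between palette and prediction modes can be pictured with a generic Lagrangian rate-distortion cost, J = D + λR. The sketch below is only an illustration of that general idea, not HLC's actual design: the palette construction, the left-neighbor predictor, the bit-count estimates, and all names (`palette_candidate`, `prediction_candidate`, `choose_mode`, `lam`, `step`) are assumptions introduced here, and the palette shown is a plain frequency-based one rather than the data-dependency-free palette described in the abstract.

```python
import numpy as np

def palette_candidate(block, palette_size=4):
    """Hypothetical palette mode: keep the most frequent sample values."""
    values, counts = np.unique(block, return_counts=True)
    palette = values[np.argsort(-counts)[:palette_size]].astype(int)
    # Map every sample to the index of its nearest palette entry.
    idx = np.abs(block[..., None].astype(int) - palette[None, None, :]).argmin(axis=-1)
    recon = palette[idx]
    # Assumed rate model: 8 bits per palette entry plus fixed-length indices.
    bits_per_index = max(1, int(np.ceil(np.log2(len(palette)))))
    rate = 8 * len(palette) + bits_per_index * block.size
    return recon, rate

def prediction_candidate(block, step=8):
    """Hypothetical prediction mode: closed-loop left-neighbor DPCM."""
    recon = np.zeros(block.shape, dtype=int)
    rate = 0
    for r in range(block.shape[0]):
        pred = 128  # assumed mid-gray predictor at the start of each row
        for c in range(block.shape[1]):
            q = int(round((int(block[r, c]) - pred) / step))
            # Rough Exp-Golomb-style length estimate for the quantized residual.
            rate += 1 + 2 * int(np.floor(np.log2(abs(q) + 1)))
            pred = min(255, max(0, pred + q * step))
            recon[r, c] = pred
    return recon, rate

def choose_mode(block, lam=10.0):
    """Pick the mode with the lowest cost J = SSE + lam * estimated bits."""
    candidates = {
        "palette": palette_candidate(block),
        "prediction": prediction_candidate(block),
    }
    costs = {}
    for name, (recon, rate) in candidates.items():
        sse = float(np.sum((block.astype(int) - recon) ** 2))
        costs[name] = sse + lam * rate
    return min(costs, key=costs.get), costs

# A two-color, text-like block and a smooth ramp, both 8x8.
text_block = np.zeros((8, 8), dtype=np.uint8)
text_block[:, ::2] = 255
ramp_block = np.tile(np.arange(0, 64, 8), (8, 1)).astype(np.uint8)

print(choose_mode(text_block)[0])
print(choose_mode(ramp_block)[0])
```

Under these assumed cost models, the two-color block is represented exactly by a two-entry palette at a low index cost, while the ramp has more distinct values than the palette can hold and is cheaper to code with small prediction residuals. This is the kind of content-dependent trade-off that a rate-distortion arbiter is meant to resolve.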