🤖 AI Summary
Conventional image coding standards struggle to serve both human visual fidelity and downstream machine vision tasks. Method: JPEG AI is a learning-based image coding standard developed by the Joint Photographic Experts Group (JPEG) that targets both human visualization and machine consumption through a single-stream, compact compressed-domain representation. Its first version, scheduled for completion in early 2025, focuses on human vision tasks. Contribution/Results: Compared to existing standards, JPEG AI demonstrates significant BD-rate reductions in terms of the MS-SSIM, FSIM, VIF, VMAF, PSNR-HVS, IW-SSIM, and NLPD quality metrics. It also incorporates design features intended to support interoperable deployment across diverse devices and applications.
📝 Abstract
JPEG AI is an emerging learning-based image coding standard developed by the Joint Photographic Experts Group (JPEG). The scope of JPEG AI is the creation of a practical learning-based image coding standard that offers a single-stream, compact compressed-domain representation targeting both human visualization and machine consumption. Scheduled for completion in early 2025, the first version of JPEG AI focuses on human vision tasks and demonstrates significant BD-rate reductions compared to existing standards in terms of the MS-SSIM, FSIM, VIF, VMAF, PSNR-HVS, IW-SSIM, and NLPD quality metrics. To ensure broad interoperability, JPEG AI incorporates various design features that support deployment across diverse devices and applications. This paper provides an overview of the technical features and characteristics of the JPEG AI standard.
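For readers unfamiliar with the BD-rate figure of merit mentioned above, the following is a minimal sketch of the classic four-point Bjøntegaard delta-rate calculation: fit a cubic to log-rate as a function of quality for each codec, average the gap between the two curves over the shared quality range, and convert it to a percentage. It is not taken from the JPEG AI evaluation procedure; the function names and the rate/quality numbers are hypothetical, and the quality score could be any of the listed metrics as long as higher means better.

```python
import math


def _lagrange(xs, ys, x):
    """Evaluate the polynomial passing through the points (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def _simpson(f, lo, hi):
    """Simpson's rule on [lo, hi]; exact for polynomials up to degree 3."""
    mid = 0.5 * (lo + hi)
    return (hi - lo) / 6.0 * (f(lo) + 4.0 * f(mid) + f(hi))


def bd_rate(anchor, test):
    """Average bitrate change (percent) of `test` relative to `anchor`.

    Each argument is a list of exactly four (rate, quality) pairs with
    positive rates and distinct quality values. With four points, the cubic
    through the points equals the least-squares cubic fit used in the
    classic method. Negative results mean `test` needs fewer bits for the
    same quality.
    """
    if len(anchor) != 4 or len(test) != 4:
        raise ValueError("this sketch expects exactly four points per codec")

    q_a = [q for _, q in anchor]
    q_t = [q for _, q in test]
    log_r_a = [math.log(r) for r, _ in anchor]
    log_r_t = [math.log(r) for r, _ in test]

    # Only compare the curves where both codecs were actually measured.
    lo = max(min(q_a), min(q_t))
    hi = min(max(q_a), max(q_t))
    if hi <= lo:
        raise ValueError("quality ranges do not overlap")

    area_a = _simpson(lambda q: _lagrange(q_a, log_r_a, q), lo, hi)
    area_t = _simpson(lambda q: _lagrange(q_t, log_r_t, q), lo, hi)

    avg_log_diff = (area_t - area_a) / (hi - lo)
    return (math.exp(avg_log_diff) - 1.0) * 100.0


# Hypothetical rate (bits per pixel) and quality points, for illustration only.
anchor_points = [(0.25, 30.0), (0.50, 33.0), (1.00, 36.0), (2.00, 39.0)]
# A codec that reaches the same quality with 20% fewer bits at every point.
test_points = [(0.8 * r, q) for r, q in anchor_points]

print(f"BD-rate: {bd_rate(anchor_points, test_points):.2f}%")
```

Because the hypothetical test codec uses 20% fewer bits at every quality level, the two log-rate curves differ by a constant and the sketch reports a BD-rate of about -20%. Published comparisons typically repeat this calculation per image and per metric and then average the results.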