🤖 AI Summary
Conventional wisdom holds that lossy JPEG compression degrades deep neural network (DNN) performance. This work challenges that assumption with JPEG-DL, a framework that prepends a trainable JPEG compression layer to an arbitrary DNN. The layer replaces JPEG's hard quantization of discrete cosine transform (DCT) coefficients with a differentiable soft quantizer, so the quantization operation can be trained end to end, jointly with the underlying network. Experiments report substantial gains in image classification accuracy, by as much as 20.9% on some fine-grained datasets, along with improved robustness against adversarial attacks. The gains hold across multiple datasets and model architectures.
📝 Abstract
Although lossy image compression, such as JPEG compression, is traditionally believed to harm the performance of deep neural networks (DNNs), recent work has shown that well-crafted JPEG compression can actually improve the performance of deep learning (DL). Inspired by this, we propose JPEG-DL, a novel DL framework that prepends a trainable JPEG compression layer to any underlying DNN architecture. To make the quantization operation in JPEG compression trainable, the JPEG layer employs a new differentiable soft quantizer, and the quantization operation and the underlying DNN are then trained jointly. Extensive experiments show that, compared with standard DL, JPEG-DL delivers significant accuracy improvements across various datasets and model architectures while enhancing robustness against adversarial attacks. In particular, on some fine-grained image classification datasets, JPEG-DL increases prediction accuracy by as much as 20.9%. Our code is available at https://github.com/AhmedHussKhalifa/JPEG-Inspired-DL.git.
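To make the idea of a differentiable soft quantizer concrete, the sketch below contrasts ordinary hard rounding with one common smooth relaxation: a softmax-weighted average over the reconstruction levels. The abstract does not give the quantizer's exact form, so this formula is an assumption for illustration, and the names `soft_quantize`, `hard_quantize`, `alpha`, and `num_levels` are hypothetical rather than taken from the authors' code.

```python
import math


def hard_quantize(x, step):
    """Standard uniform quantization: snap x to the nearest multiple of step.

    The output is piecewise constant in x and step, so its gradient is zero
    almost everywhere and gives gradient-based training no useful signal.
    """
    return step * round(x / step)


def soft_quantize(x, step, alpha=5.0, num_levels=8):
    """Illustrative soft quantizer (an assumed form, not the paper's definition).

    Returns a softmax-weighted average of the levels k * step for
    k in [-num_levels, num_levels]. Levels closer to x get larger weights,
    and alpha controls how sharply the weights concentrate on the nearest one.
    """
    levels = [k * step for k in range(-num_levels, num_levels + 1)]
    # Logit of each level: negative squared distance to x, measured in steps.
    logits = [-alpha * ((x - level) / step) ** 2 for level in levels]
    # Subtract the maximum logit before exponentiating for numerical stability.
    max_logit = max(logits)
    weights = [math.exp(logit - max_logit) for logit in logits]
    total = sum(weights)
    return sum(w * level for w, level in zip(weights, levels)) / total


if __name__ == "__main__":
    coefficient, step = 3.1, 2.0
    print("hard:", hard_quantize(coefficient, step))
    for alpha in (1.0, 5.0, 50.0, 500.0):
        print("soft, alpha =", alpha, "->", soft_quantize(coefficient, step, alpha))
```

As `alpha` grows, the soft output approaches the hard-quantized value, while smaller values give a smoother function of the input. Because every operation is smooth in both `x` and `step`, a layer built this way can, in principle, pass gradients to a learnable quantization step, which is the property that joint training with the downstream network relies on.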