Data Compression with Relative Entropy Coding

📅 2025-06-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses theoretical and practical bottlenecks in data compression—specifically, continuous-space modeling, end-to-end differentiability for machine learning integration, and joint optimization of privacy preservation and perceptual quality. We propose a novel compression framework grounded in Relative Entropy Coding (REC). First, we derive the tight information-theoretic limits of REC. Second, we design an optimal coding algorithm based on Poisson point processes. Third, we develop the first lightweight Bayesian implicit neural representation compression system tailored for multimodal data. The framework enables fully differentiable, end-to-end training and natively supports probabilistic neural networks and implicit representations. Extensive evaluation across images, audio, video, and protein structure data demonstrates compression ratios approaching the theoretical optimum—substantially outperforming conventional quantization methods—while simultaneously ensuring strong privacy guarantees and high human-perceptual fidelity.

📝 Abstract
Over the last few years, machine learning unlocked previously infeasible features for compression, such as providing guarantees for users' privacy or tailoring compression to specific data statistics (e.g., satellite images or audio recordings of animals) or users' audiovisual perception. This, in turn, has led to an explosion of theoretical investigations and insights that aim to develop new fundamental theories, methods and algorithms better suited for machine learning-based compressors. In this thesis, I contribute to this trend by investigating relative entropy coding, a mathematical framework that generalises classical source coding theory. Concretely, relative entropy coding deals with the efficient communication of uncertain or randomised information. One of its key advantages is that it extends compression methods to continuous spaces and can thus be integrated more seamlessly into modern machine learning pipelines than classical quantisation-based approaches. Furthermore, it is a natural foundation for developing advanced compression methods that are privacy-preserving or account for the perceptual quality of the reconstructed data. The thesis considers relative entropy coding at three conceptual levels: after introducing the basics of the framework, (1) I prove results that provide new, maximally tight fundamental limits to the communication and computational efficiency of relative entropy coding; (2) I use the theory of Poisson point processes to develop and analyse new relative entropy coding algorithms whose performance attains the theoretical optima; and (3) I showcase the strong practical performance of relative entropy coding by applying it to image, audio, video and protein data compression using small, energy-efficient, probabilistic neural networks called Bayesian implicit neural representations.
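To make the Poisson-point-process idea in the abstract concrete, here is a minimal sketch of the Poisson functional representation (due to Li and El Gamal), one of the basic constructions that Poisson-process-based relative entropy coding algorithms build on. It draws an exact sample from a target Q using only samples from a shared proposal P, and returns the index N that a REC scheme would entropy-code. This is an illustrative sketch, not the thesis's own algorithms; the Gaussian target/proposal pair and the bounded-density-ratio stopping rule are assumptions chosen so the search terminates.

```python
import math
import random

def log_normal_pdf(x, mu, sigma):
    """Log-density of a univariate Gaussian N(mu, sigma^2)."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def poisson_functional_sample(log_ratio, log_M, sample_p, rng):
    """Exact sample from Q via the Poisson functional representation:
    N = argmin_i T_i / (dQ/dP)(X_i), where T_1 < T_2 < ... are arrival
    times of a unit-rate Poisson process and X_i ~ P i.i.d.
    Assumes dQ/dP <= M, which makes a finite stopping rule valid.
    Returns (N, X_N); a REC scheme entropy-codes the index N."""
    t = 0.0
    best_score = math.inf
    best = None
    i = 0
    while True:
        i += 1
        t += rng.expovariate(1.0)             # next Poisson arrival time T_i
        x = sample_p(rng)                     # candidate X_i ~ P
        score = math.log(t) - log_ratio(x)    # log of T_i / (dQ/dP)(X_i)
        if score < best_score:
            best_score, best = score, (i, x)
        if math.log(t) - log_M > best_score:  # no later point can beat the best
            return best

# Toy pair with bounded density ratio: proposal P = N(0, 1), target Q = N(1, 0.5^2).
mu_q, sigma_q = 1.0, 0.5
log_ratio = lambda x: log_normal_pdf(x, mu_q, sigma_q) - log_normal_pdf(x, 0.0, 1.0)
# For sigma_q < 1 the ratio dQ/dP is maximised at mu_q / (1 - sigma_q^2).
log_M = log_ratio(mu_q / (1.0 - sigma_q ** 2))

rng = random.Random(0)
samples = [poisson_functional_sample(log_ratio, log_M, lambda r: r.gauss(0.0, 1.0), rng)
           for _ in range(2000)]
mean = sum(x for _, x in samples) / len(samples)
```

Because the returned samples are exactly Q-distributed, their empirical mean converges to 1. The expected codelength of the index N is, up to logarithmic overhead, the KL divergence between Q and P, which is the sense in which REC communicates a random sample "for the price of" the relative entropy.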
Problem

Research questions and friction points this paper is trying to address.

Extends compression to continuous spaces via relative entropy coding
Develops privacy-preserving and perception-aware compression methods
Proves tight limits and efficient algorithms for relative entropy coding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generalizes classical source coding theory with relative entropy coding
Uses Poisson point processes to build optimal coding algorithms
Applies Bayesian implicit neural representations for practical compression
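The last point, compressing data with Bayesian implicit neural representations, relies on REC's key accounting identity: transmitting a sample from a weight posterior q relative to a shared prior p costs roughly KL(q || p) bits. A minimal sketch of that codelength estimate for factorised Gaussian posteriors follows; the per-weight posterior parameters here are hypothetical placeholders, not values from the thesis.

```python
import math

def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """KL(N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2)) in nats."""
    return (math.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sigma_p ** 2)
            - 0.5)

# Hypothetical per-weight posteriors q_i = N(mu_i, sigma_i^2) of a tiny
# Bayesian implicit neural representation, each coded against a shared
# standard-normal prior p = N(0, 1).
posteriors = [(0.8, 0.3), (-0.2, 0.5), (1.4, 0.2), (0.0, 0.9)]

# Approximate total codelength in bits: sum of per-weight KLs, converted
# from nats by dividing by ln 2 (REC adds only logarithmic overhead).
codelength_bits = sum(gaussian_kl(m, s, 0.0, 1.0) for m, s in posteriors) / math.log(2)
```

This is why making the posteriors small (close to the prior) directly shrinks the compressed size, and why the whole pipeline stays differentiable: the KL is a smooth function of the posterior parameters, unlike a quantisation step.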