🤖 AI Summary
This work addresses a key limitation of conventional neural compression methods: they are constrained to fixed operating points in the rate–distortion–perception (RDP) trade-off and must be retrained to target different trade-offs. The authors propose a training-free, universal framework that combines a pre-trained diffusion model with reverse channel coding (RCC) and a novel score-scaled probability flow ODE decoder, enabling, for the first time, continuous and flexible navigation across the entire RDP trade-off space. Theoretically, the decoder is proven to be optimal for the distortion–perception trade-off under additive Gaussian noise, and the overall framework with the RCC module achieves the optimal RDP function in the Gaussian case. Experiments on multiple datasets demonstrate that the framework is both theoretically grounded and practically effective for adaptive, perception-aware compression.
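The RCC module is only named at a high level here. As a concrete point of reference, the sketch below shows one standard reverse channel coding scheme (minimal random coding via importance sampling over shared prior samples); it is an illustrative stand-in, not necessarily the paper's implementation, and the names `rcc_encode`, `rcc_decode`, and the fixed-budget `num_bits` interface are assumptions introduced for this example.

```python
import torch

def rcc_encode(q, p, num_bits, seed):
    """One common RCC scheme (minimal random coding), shown for illustration.

    q : target distribution the decoder should receive a sample from
        (e.g. the simulated AWGN channel output given the source).
    p : shared prior known to both encoder and decoder.
    Both are assumed to be torch.distributions objects with .sample() / .log_prob().
    Returns the transmitted index (about `num_bits` bits).
    """
    K = 2 ** num_bits
    torch.manual_seed(seed)                         # shared randomness
    candidates = p.sample((K,))                     # K i.i.d. prior samples
    # Importance weights q/p, summed over event dimensions.
    log_w = (q.log_prob(candidates) - p.log_prob(candidates)).reshape(K, -1).sum(-1)
    idx = torch.distributions.Categorical(logits=log_w).sample()
    return int(idx)

def rcc_decode(p, num_bits, seed, idx):
    """Decoder side: regenerate the same candidates from the shared seed."""
    K = 2 ** num_bits
    torch.manual_seed(seed)                         # same seed -> same candidates
    candidates = p.sample((K,))
    return candidates[idx]
```

In this scheme the transmitted index costs roughly `num_bits` bits, and for the received sample to be approximately distributed as q the number of candidates must grow on the order of 2^{KL(q‖p)}, which is what ties the achievable bitrate to the noise level of the simulated channel.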
📝 Abstract
The rate-distortion-perception (RDP) tradeoff characterizes the fundamental limits of lossy compression by jointly considering bitrate, reconstruction fidelity, and perceptual quality. While recent neural compression methods have improved perceptual performance, they typically operate at fixed points on the RDP surface, requiring retraining to target different tradeoffs. In this work, we propose a training-free framework that leverages pre-trained diffusion models to traverse the entire RDP surface. Our approach integrates a reverse channel coding (RCC) module with a novel score-scaled probability flow ODE decoder. We theoretically prove that the proposed diffusion decoder is optimal for the distortion-perception tradeoff under AWGN observations and that the overall framework with the RCC module achieves the optimal RDP function in the Gaussian case. Empirical results across multiple datasets demonstrate the framework's flexibility and effectiveness in navigating the ternary RDP tradeoff using pre-trained diffusion models. Our results establish a practical and theoretically grounded approach to adaptive, perception-aware compression.
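The score-scaled probability flow ODE decoder is likewise only named, not specified, in the abstract. The sketch below gives one plausible reading under a VE-style diffusion parameterization, with a hypothetical scaling knob `alpha` standing in for the paper's score-scaling mechanism; it is not the authors' reference implementation.

```python
import torch

@torch.no_grad()
def score_scaled_pf_ode_decode(y, score_fn, sigmas, alpha=1.0):
    """Illustrative score-scaled probability flow ODE decoder (VE parameterization).

    y        : observation assumed to be the clean signal plus AWGN at noise
               level sigmas[0] (e.g. the sample delivered by the RCC step).
    score_fn : pre-trained score network, score_fn(x, sigma) ~ grad_x log p_sigma(x).
    sigmas   : decreasing noise schedule from sigmas[0] down to ~0.
    alpha    : hypothetical scaling of the score term; alpha = 1 is the plain
               probability flow ODE, other values shift the operating point
               along the distortion-perception curve.
    """
    x = y.clone()
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        # Euler step of the VE probability flow ODE, dx/dsigma = -sigma * score,
        # with the score scaled by alpha.
        score = alpha * score_fn(x, sigma)
        x = x + (sigma_next - sigma) * (-sigma) * score
    return x
```

As a sanity check of this reading on the toy case of unit-Gaussian data: with alpha = 1 the ODE maps y to y / sqrt(1 + sigma_0^2), a sample with the correct marginal (perception endpoint), while alpha = 2 maps y to y / (1 + sigma_0^2), exactly the posterior mean (minimum-distortion endpoint), which illustrates how a single score scale could sweep the distortion-perception trade-off.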