🤖 AI Summary
This work addresses how to flexibly balance distortion against perceptual quality (the D-P trade-off) at inference time in zero-shot inverse problems. The authors propose a two-stage framework, MAP-RPS, which first obtains a low-distortion initial solution via maximum a posteriori (MAP) estimation and then progressively improves perceptual quality through re-noised posterior sampling. The approach is extended to latent space as LMAP-RPS so that it can use large-scale pre-trained latent diffusion models. The method enables controllable traversal of the D-P trade-off at inference time with a single diffusion model, and both stages are backed by theoretical analysis. Experiments across various inverse problems show more effective D-P traversal and strong performance as an efficient solver for real-world inverse problems.
📝 Abstract
The distortion-perception (D-P) tradeoff is a fundamental phenomenon of Bayesian inverse problems, which characterizes the inherent tension between distortion performance and perceptual quality. Enabling flexible traversal of the D-P tradeoff at inference time is crucial for practical applications. Despite the recent success of diffusion models in zero-shot inverse problem solving, efficient and principled strategies for D-P traversal in diffusion-based inverse algorithms remain underexplored. In this paper, we propose a stage-wise framework for realizing D-P traversal using a single diffusion model in zero-shot inverse problems. Our proposed method, termed MAP-RPS, starts with a maximum a posteriori (MAP) estimation stage that approximates the minimum mean squared error (MMSE) solution and provides a low-distortion initialization, followed by a re-noised posterior sampling stage that progressively improves perceptual quality. We provide theoretical analyses for both stages, establishing the validity and effectiveness of the proposed design. Furthermore, we extend MAP-RPS to the latent space, yielding LMAP-RPS, which enjoys broader applicability by leveraging large-scale pre-trained latent diffusion backbones. Extensive experiments demonstrate that MAP-RPS and LMAP-RPS enable more effective D-P traversal on various tasks, while also exhibiting strong performance as efficient solvers for real-world inverse problems.
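The tension the abstract describes can be seen in a toy problem with a closed-form solution. The sketch below is not the MAP-RPS algorithm and uses no diffusion model. It assumes a scalar Gaussian signal x ~ N(0, 1) observed as y = x + noise, and a hypothetical knob `alpha` that moves from the MMSE estimate (`alpha = 0`, lowest distortion) to an exact posterior sample (`alpha = 1`, output distribution matches the prior). The function name `simulate`, the knob, and the use of output variance as a stand-in for perceptual quality are all illustrative assumptions.

```python
import random
import statistics


def simulate(alpha, noise_std=0.5, n=100_000, seed=0):
    """Toy scalar Gaussian inverse problem: x ~ N(0, 1), y = x + noise_std * e.

    alpha = 0 gives the MMSE estimate (posterior mean).
    alpha = 1 gives an exact sample from the posterior p(x | y).
    Returns (mean squared error, variance of the estimates).
    """
    rng = random.Random(seed)
    gain = 1.0 / (1.0 + noise_std ** 2)         # posterior mean is gain * y
    post_std = (noise_std ** 2 * gain) ** 0.5   # posterior standard deviation
    sq_errors, outputs = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = x + noise_std * rng.gauss(0.0, 1.0)
        # Posterior mean plus a scaled share of posterior randomness.
        x_hat = gain * y + alpha * post_std * rng.gauss(0.0, 1.0)
        sq_errors.append((x_hat - x) ** 2)
        outputs.append(x_hat)
    mean_out = statistics.fmean(outputs)
    variance = statistics.fmean([(o - mean_out) ** 2 for o in outputs])
    return statistics.fmean(sq_errors), variance


if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        mse, var = simulate(alpha)
        print(f"alpha={alpha:.1f}  mse={mse:.3f}  output_var={var:.3f}")
```

In this toy setting the expected MSE is `post_std**2 * (1 + alpha**2)`, so a posterior sample has exactly twice the MMSE, while the variance of the estimates only reaches the prior variance of 1 at `alpha = 1`. Lower distortion and a better distributional match therefore pull in opposite directions, which is the tradeoff the two stages of MAP-RPS are designed to traverse.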