Edge-preserving noise for diffusion models

📅 2024-10-02
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Traditional diffusion models employ isotropic Gaussian denoising, which treats all spatial regions uniformly and struggles to preserve image edges and structural detail. To address this, we propose Edge-Aware Diffusion (EADiff), a generative model that incorporates anisotropic diffusion principles: an edge-detection-guided adaptive noise scheduler dynamically blends edge-preserving noise with standard Gaussian noise, strengthening the model's grasp of the low-to-mid-frequency components that encode shape and structure. On unconditional image generation, EADiff achieves improvements of up to 30% in FID and CLIP score, and it is markedly more robust in structure-sensitive tasks such as stroke-to-image generation. The core contribution is a structure-aware noise-scheduling mechanism for diffusion models that injects geometric priors the conventional diffusion process neglects, while improving both modeling accuracy and convergence speed.

📝 Abstract
Classical generative diffusion models learn an isotropic Gaussian denoising process, treating all spatial regions uniformly, thus neglecting potentially valuable structural information in the data. Inspired by the long-established work on anisotropic diffusion in image processing, we present a novel edge-preserving diffusion model that generalizes existing isotropic models by considering a hybrid noise scheme. In particular, we introduce an edge-aware noise scheduler that varies between edge-preserving and isotropic Gaussian noise. We show that our model's generative process converges faster to results that more closely match the target distribution. We demonstrate its capability to better learn the low-to-mid frequencies within the dataset, which play a crucial role in representing shapes and structural information. Our edge-preserving diffusion process consistently outperforms state-of-the-art baselines in unconditional image generation. It is also notably more robust for generative tasks guided by a shape-based prior, such as stroke-to-image generation. We present qualitative and quantitative results (FID and CLIP score) showing consistent improvements of up to 30% for both tasks.
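The hybrid noise scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the finite-difference edge map, the linear blending schedule `lam = t / T`, and the choice of damping noise at edge locations are all assumptions made for the sake of the example.

```python
import numpy as np

def edge_map(img):
    """Approximate per-pixel edge strength with finite differences,
    normalized to [0, 1]. (Illustrative stand-in for an edge detector.)"""
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:] = img[:, 1:] - img[:, :-1]   # horizontal gradient
    gy[1:, :] = img[1:, :] - img[:-1, :]   # vertical gradient
    mag = np.sqrt(gx**2 + gy**2)
    return mag / (mag.max() + 1e-8)

def hybrid_noise(img, t, T, rng):
    """Blend edge-preserving noise with isotropic Gaussian noise.

    Here the edge-preserving component simply attenuates noise where the
    edge map is strong, and a linear schedule interpolates toward plain
    isotropic noise as t approaches T (both choices are assumptions)."""
    lam = t / T                              # 0: edge-preserving, 1: isotropic
    e = edge_map(img)                        # high values near edges
    iso = rng.standard_normal(img.shape)     # standard isotropic Gaussian
    edge_preserving = iso * (1.0 - e)        # damp noise on edges
    return (1.0 - lam) * edge_preserving + lam * iso
```

At `t = T` the scheme reduces exactly to standard isotropic Gaussian noise, matching the abstract's claim that the model generalizes the isotropic case.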
Problem

Research questions and friction points this paper is trying to address.

How to preserve edges and structural information that isotropic Gaussian noise destroys
How to design a noise scheduler that better captures low-to-mid frequencies
How to make generation guided by shape-based priors more robust
Innovation

Methods, ideas, or system contributions that make the work stand out.

Edge-preserving noise scheduler
Hybrid isotropic-anisotropic diffusion
Faster convergence with structural fidelity
🔎 Similar Papers
2024-09-16 · Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences · Citations: 8
Jente Vandersanden · Max Planck Institute for Informatics, Germany
Sascha Holl · Max Planck Institute for Informatics, Germany
Xingchang Huang · Max Planck Institute for Informatics, Germany
Gurprit Singh · Advanced Micro Devices (AMD)
Generative Models · MCMC · Generative Rendering