🤖 AI Summary
Existing diffusion models struggle to achieve fine-grained, pixel-level color control and often generate images whose colors fall outside the specified color pattern. To address this, the authors propose Color-Aligned Diffusion (CAD), a method that explicitly projects either latent variables or pixel values onto a conditional color manifold during the diffusion process. Applying this projection within the iterative denoising procedure keeps generation aligned with the input color distribution, even under strict color constraints. Because the projection constrains colors rather than spatial layout, the model remains free to produce plausible structures and diverse content while color accuracy and controllability improve. Experiments reported by the authors show state-of-the-art performance in color-conditional generation, with image quality and diversity on par with standard diffusion models.
📝 Abstract
Diffusion models have shown great promise in synthesizing visually appealing images. However, it remains challenging to condition the synthesis at a fine-grained level, for instance, synthesizing image pixels that follow some generic color pattern. Existing image synthesis methods often produce content that falls outside the desired pixel conditions. To address this, we introduce a novel color alignment algorithm that confines the generative process in diffusion models to a given color pattern. Specifically, we project diffusion terms, either image samples or latent representations, into a conditional color space to align with the input color distribution. This strategy simplifies the prediction in diffusion models to a color manifold while still allowing plausible structures in the generated content, thus enabling the generation of diverse content that complies with the target color pattern. Experimental results demonstrate state-of-the-art performance in conditioning on and controlling pixel colors, while maintaining generation quality and diversity on par with regular diffusion models.
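The abstract does not spell out the projection operator, so the following is only a minimal sketch of the general idea, under stated assumptions: the target color pattern is given as a small palette of RGB values, "projection" means moving each pixel toward its nearest palette color, and the projection strength grows linearly over the denoising steps. The names `project_to_palette`, `sample_with_color_alignment`, and `denoise_step` are hypothetical and are not taken from the paper.

```python
import numpy as np


def project_to_palette(pixels, palette, strength=1.0):
    """Move each pixel toward its nearest palette color.

    pixels:   (N, 3) array of colors.
    palette:  (K, 3) array of target colors (assumed form of the condition).
    strength: 0.0 leaves pixels unchanged, 1.0 snaps them onto the palette.
    """
    pixels = np.asarray(pixels, dtype=np.float64)
    palette = np.asarray(palette, dtype=np.float64)
    # Squared Euclidean distance from every pixel to every palette color: (N, K).
    d2 = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=-1)
    nearest = palette[d2.argmin(axis=1)]
    # Linear blend between the original pixel and its nearest palette color.
    return (1.0 - strength) * pixels + strength * nearest


def sample_with_color_alignment(x, palette, denoise_step, num_steps=10):
    """Toy sampling loop: one denoising update, then one color projection.

    denoise_step is a caller-supplied callable (x, step_index) -> x that
    stands in for a real diffusion update; it is a placeholder assumption.
    """
    for i in range(num_steps):
        x = denoise_step(x, i)
        # Assumed schedule: weak alignment early, full alignment at the last step.
        strength = (i + 1) / num_steps
        x = project_to_palette(x, palette, strength)
    return x
```

With an identity `denoise_step`, the loop ends with every pixel lying exactly on a palette color, which illustrates the constraint in isolation; pairing it with a real denoiser, or applying it to latent representations instead of pixels, would require a projection defined in that space.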