SketchDeco: Decorating B&W Sketches with Colour

📅 2024-05-29
🏛️ arXiv.org
📈 Citations: 5
Influential: 0

🤖 AI Summary
To address limited professional controllability, ambiguous text prompts, and dependence on training in sketch coloring, this paper proposes a training-free, guided multi-stage coloring framework. It combines ControlNet and region-specific masks with Stable Diffusion v1.5 and introduces three components: (i) a sketch inversion mechanism for faithful latent initialization; (ii) a guidance-aware sampling strategy for structured color propagation; and (iii) a self-attention module with a scaling factor that balances local detail fidelity against global color consistency. Text prompts generated by BLIP-2 are combined with user-specified palettes to enable fine-grained, interactive control. Inference takes on the order of seconds on a single consumer-grade Nvidia RTX 4090 Super GPU and produces high-fidelity outputs aligned with user intent, suitable for professional applications such as design prototyping and storyboarding. The work is described as the first zero-shot, highly controllable, and fast approach to sketch coloring.

📝 Abstract
This paper introduces a novel approach to sketch colourisation, inspired by the universal childhood activity of colouring and its professional applications in design and story-boarding. Striking a balance between precision and convenience, our method utilises region masks and colour palettes to allow intuitive user control, steering clear of the meticulousness of manual colour assignments or the limitations of textual prompts. By strategically combining ControlNet and staged generation, incorporating Stable Diffusion v1.5, and leveraging BLIP-2 text prompts, our methodology facilitates faithful image generation and user-directed colourisation. Addressing challenges of local and global consistency, we employ inventive solutions such as an inversion scheme, guided sampling, and a self-attention mechanism with a scaling factor. The resulting tool is not only fast and training-free but also compatible with consumer-grade Nvidia RTX 4090 Super GPUs, making it a valuable asset for both creative professionals and enthusiasts in various fields. Project Page: https://chaitron.github.io/SketchDeco/
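
The abstract mentions a self-attention mechanism with a scaling factor but does not say where the factor enters. The sketch below is one plausible reading, not the paper's implementation: it assumes the factor multiplies the attention logits before the softmax, so larger values sharpen attention and a value of zero makes it uniform. The function name `scaled_self_attention`, the parameter `scale`, and the toy projection matrices are all hypothetical.

```python
import numpy as np

def scaled_self_attention(x, w_q, w_k, w_v, scale=1.0):
    """Single-head self-attention with an extra logit scale (assumed placement)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d = q.shape[-1]
    # Standard dot-product logits, multiplied by the hypothetical scaling factor.
    logits = scale * (q @ k.T) / np.sqrt(d)
    # Subtract the row maximum for a numerically stable softmax.
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
tokens, dim = 6, 8
x = rng.standard_normal((tokens, dim))
w_q = rng.standard_normal((dim, dim))
w_k = rng.standard_normal((dim, dim))
w_v = rng.standard_normal((dim, dim))

out_base, w_base = scaled_self_attention(x, w_q, w_k, w_v, scale=1.0)
out_sharp, w_sharp = scaled_self_attention(x, w_q, w_k, w_v, scale=4.0)
out_flat, w_flat = scaled_self_attention(x, w_q, w_k, w_v, scale=0.0)
```

Under this assumption, a small factor makes every token attend broadly, which favours global colour agreement, while a large factor concentrates attention on a few tokens, which favours local detail.
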
Problem

Research questions and friction points this paper is trying to address.

Bridges professional design needs with intuitive region-based color control
Enables precise spatial color specification without manual assignment or text prompts
Achieves local color fidelity and global harmony without model fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Training-free latent composition for sketch colorization
Guided latent-space blending with diffusion inversion
Custom self-attention mechanism ensures local-global harmony
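
The latent composition and guided latent-space blending listed above can be pictured with a toy example. This is a minimal sketch under stated assumptions, not the paper's procedure: it assumes binary region masks at latent resolution and a simple per-region overwrite of a background latent. The function `blend_latents` and the constant-valued "region latents" are hypothetical stand-ins for latents that a diffusion model would produce for each palette colour.

```python
import numpy as np

def blend_latents(region_latents, region_masks, background_latent):
    """Composite per-region latents onto a background latent using binary masks."""
    out = background_latent.copy()
    for latent, mask in zip(region_latents, region_masks):
        # Add a channel axis so the (H, W) mask broadcasts over (C, H, W).
        m = mask[None, :, :].astype(out.dtype)
        out = m * latent + (1.0 - m) * out
    return out

rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
background = rng.standard_normal((C, H, W))

# Stand-ins for latents generated for two user-chosen colours (hypothetical).
region_a = np.full((C, H, W), 1.0)
region_b = np.full((C, H, W), -1.0)

# Two non-overlapping user masks: left half, and top-right quadrant.
mask_a = np.zeros((H, W))
mask_a[:, :4] = 1
mask_b = np.zeros((H, W))
mask_b[:4, 4:] = 1

blended = blend_latents([region_a, region_b], [mask_a, mask_b], background)
```

Pixels outside every mask keep the background latent unchanged. In a staged pipeline, a blend like this would typically be applied at selected denoising steps rather than once.
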