🤖 AI Summary
To give users fine-grained control over both the semantic content and the boundary-blending strength in image inpainting, this paper proposes ControlFill, a framework for interactive, pixel-level manipulation. It trains two prompt embeddings that decouple "object generation" from "background extension" semantics, and pairs them with pixel-wise, spatially varying guidance scales, enabling lightweight control without a heavy text encoder. By jointly adjusting the relative prompt weights and the local guidance intensity, ControlFill combines prompt learning with classifier-free guidance in diffusion-based inpainting, letting users steer the semantics, spatial placement, and boundary blending of the repaired region while preserving high-fidelity synthesis. Experiments show that ControlFill balances controllability and fidelity, outperforming existing methods on fine-grained, user-directed editing.
📝 Abstract
In this report, I present an inpainting framework named *ControlFill*, which involves training two distinct prompts: one for generating plausible objects within a designated mask (*creation*) and another for filling the region by extending the background (*removal*). During the inference stage, these learned embeddings guide a diffusion network that operates without requiring heavy text encoders. By adjusting the relative significance of the two prompts and employing classifier-free guidance, users can control the intensity of removal or creation. Furthermore, I introduce a method to spatially vary the intensity of guidance by assigning different scales to individual pixels.
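The abstract's guidance scheme can be sketched as follows. This is a minimal NumPy illustration of the general idea, not the paper's implementation: the function name, the linear blending of the two prompt predictions, and the exact form of the per-pixel scale map are my assumptions; only the dual-prompt weighting and the spatially varying classifier-free guidance scale come from the text.

```python
import numpy as np

def guided_noise(eps_uncond, eps_create, eps_remove, w, scale_map):
    """Blend the two learned prompts' noise predictions, then apply
    pixel-wise classifier-free guidance (hypothetical sketch).

    eps_uncond, eps_create, eps_remove : (C, H, W) noise predictions
        from the diffusion network under the null, "creation", and
        "removal" prompt embeddings, respectively.
    w         : scalar in [0, 1]; 1 -> pure creation, 0 -> pure removal.
    scale_map : (H, W) per-pixel guidance scale, so guidance strength
        can differ between mask interior and boundary.
    """
    # Interpolate between the two learned prompt predictions.
    eps_cond = w * eps_create + (1.0 - w) * eps_remove
    # Standard CFG update, but with a spatially varying scale.
    return eps_uncond + scale_map[None, :, :] * (eps_cond - eps_uncond)
```

With `scale_map` set to zero everywhere, the output reduces to the unconditional prediction; setting it high only near the mask boundary would, under this sketch, strengthen guidance there while leaving the interior closer to unconditional sampling.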