TurboFill: Adapting Few-step Text-to-image Model for Fast Image Inpainting

📅 2025-04-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
To balance image quality and inference efficiency in diffusion-based image inpainting, this paper proposes TurboFill: a lightweight mask-aware inpainting adapter built on the few-step distilled text-to-image model DMD2 and trained with a novel 3-step adversarial training scheme. Because the adapter sits on top of a few-step model rather than a standard multi-step diffusion model, it avoids much of the computational cost of conventional diffusion inpainting, while the adversarial training targets realistic, structurally consistent, and visually harmonious inpainted regions. Beyond the model itself, the paper makes two contributions: (1) it introduces two new benchmarks, DilationBench, which tests performance across mask sizes, and HumanBench, which is based on human feedback for complex prompts; (2) it reports that TurboFill outperforms both the multi-step BrushNet and existing few-step inpainting methods across diverse mask scales and complex text prompts, giving a better trade-off between speed and quality for practical few-step inpainting.

📝 Abstract
This paper introduces TurboFill, a fast image inpainting model that enhances a few-step text-to-image diffusion model with an inpainting adapter for high-quality and efficient inpainting. While standard diffusion models generate high-quality results, they incur high computational costs. We overcome this by training an inpainting adapter on a few-step distilled text-to-image model, DMD2, using a novel 3-step adversarial training scheme to ensure realistic, structurally consistent, and visually harmonious inpainted regions. To evaluate TurboFill, we propose two benchmarks: DilationBench, which tests performance across mask sizes, and HumanBench, based on human feedback for complex prompts. Experiments show that TurboFill outperforms both multi-step BrushNet and few-step inpainting methods, setting a new benchmark for high-performance inpainting tasks. Our project page: https://liangbinxie.github.io/projects/TurboFill/

Problem

Research questions and friction points this paper is trying to address.

Adapting a few-step text-to-image model for fast inpainting
Reducing the computational cost of diffusion-based inpainting
Producing realistic and harmonious inpainted regions efficiently

Innovation

Methods, ideas, or system contributions that make the work stand out.

Few-step text-to-image model adaptation
3-step adversarial training scheme
DilationBench and HumanBench evaluation benchmarks
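
To make the idea of evaluating "across mask sizes" concrete, here is a minimal Python sketch of how a family of progressively larger masks could be derived from one base mask by morphological dilation. This is only an illustration: the summary does not say how DilationBench actually builds its masks, and the names `dilate`, `mask_ratio`, `base`, and `mask_series`, the 3x3 neighbourhood, and the 7x7 image size are all assumptions made for this example.

```python
from typing import List

Mask = List[List[int]]  # 1 = pixel to inpaint, 0 = pixel to keep


def dilate(mask: Mask, iterations: int = 1) -> Mask:
    """Grow the masked region by one pixel per iteration (3x3 neighbourhood)."""
    height, width = len(mask), len(mask[0])
    current = [row[:] for row in mask]
    for _ in range(iterations):
        grown = [row[:] for row in current]
        for y in range(height):
            for x in range(width):
                if current[y][x]:
                    continue
                # A kept pixel becomes masked if any of its 8 neighbours is masked.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < height and 0 <= nx < width and current[ny][nx]:
                            grown[y][x] = 1
        current = grown
    return current


def mask_ratio(mask: Mask) -> float:
    """Fraction of the image covered by the mask."""
    total = len(mask) * len(mask[0])
    return sum(map(sum, mask)) / total


# Hypothetical 7x7 image with a single masked pixel in the centre.
base = [[0] * 7 for _ in range(7)]
base[3][3] = 1

# A series of progressively larger masks derived from the same base mask.
mask_series = [dilate(base, k) for k in range(4)]
ratios = [mask_ratio(m) for m in mask_series]
```

Running an inpainting model on each mask in such a series, with the image and prompt held fixed, would show how output quality changes as the region to fill grows. The exact protocol used by the paper may differ.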