🤖 AI Summary
Diffusion models for text-to-image (T2I) generation frequently suffer from object fragmentation or incompleteness, undermining downstream utility. We identify RandomCrop, a data augmentation widely adopted during pretraining, as a primary cause of this issue. To address it, we propose a fine-tuning-free boundary activation penalty: during the early denoising steps in Stable Diffusion, we suppress feature activations in the boundary regions of the UNet's intermediate feature maps, encouraging globally coherent and structurally complete object generation. Our method operates solely via inference-time feature modulation, incurring negligible computational overhead. Experiments demonstrate consistent and significant improvements in object completeness and overall image quality across multiple benchmarks. The method generalizes across diverse prompts and scene configurations and requires no architectural modification or retraining, establishing a plug-and-play paradigm for integrity-aware T2I generation.
📝 Abstract
Diffusion models have emerged as a powerful technique for text-to-image (T2I) generation, creating high-quality, diverse images across various domains. However, a common limitation of these models is the incomplete rendering of objects, where fragments or missing parts undermine the model's performance in downstream applications. In this study, we conduct an in-depth analysis of the incompleteness issue and reveal that the primary factor behind incomplete object generation is the use of RandomCrop during model training. This widely used data augmentation method, though it enhances the model's generalization ability, disrupts object continuity during training. To address this, we propose a training-free solution that penalizes activation values at image boundaries during the early denoising steps. Our method is easily applicable to pre-trained Stable Diffusion models with minimal modifications and negligible computational overhead. Extensive experiments demonstrate the effectiveness of our method, showing substantial improvements in object integrity and image quality.
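The core idea of the boundary penalty can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's implementation: it attenuates the border of a `(C, H, W)` feature map (as would come out of a UNet block) during the first few denoising steps, leaving the interior and all later steps untouched. The function name, the margin width, the step threshold, and the suppression scale are all hypothetical choices for illustration.

```python
import numpy as np

def suppress_boundary_activations(feat, step, early_steps=10, margin=2, scale=0.0):
    """Attenuate a border of width `margin` in a (C, H, W) feature map
    during the first `early_steps` denoising steps.

    All parameter names and default values here are illustrative
    assumptions, not taken from the paper.
    """
    if step >= early_steps:
        return feat  # later steps are left unmodified
    out = feat.copy()
    out[:, :margin, :] *= scale   # top rows
    out[:, -margin:, :] *= scale  # bottom rows
    out[:, :, :margin] *= scale   # left columns
    out[:, :, -margin:] *= scale  # right columns
    return out

# Toy example: a 1-channel 6x6 feature map of ones.
feat = np.ones((1, 6, 6))
mod = suppress_boundary_activations(feat, step=0, margin=1)
print(mod[0, 0, 0], mod[0, 3, 3])  # border suppressed, interior intact
```

In a real Stable Diffusion pipeline this modulation would be applied to intermediate UNet feature maps (e.g. via forward hooks) rather than to a standalone array, with the current timestep deciding whether the penalty is active.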