🤖 AI Summary
This work addresses insufficient facial identity preservation in professional portrait generation with DreamBooth and InstantID. We propose a face-centric data augmentation strategy for personalized fine-tuning. Methodologically, we introduce FaceDistance, a novel evaluation framework based on FaceNet embeddings, to systematically quantify the impact of geometric, illumination, and stylization augmentations on SDXL-generated portraits and to identify the optimal augmentation combination. Our contributions are twofold: (1) establishing a reproducible, quantitative benchmark for facial identity fidelity; and (2) revealing how targeted augmentations enhance robustness across pose and lighting variations. Experiments demonstrate an average 18.7% improvement in FaceDistance scores, significantly strengthening facial consistency between generated outputs and source images. The approach provides an effective technical pathway for generating high-fidelity professional portraits from low-quality inputs.
📝 Abstract
Personalizing Stable Diffusion to generate professional portraits from amateur photographs is a burgeoning area with applications in various downstream contexts. This paper investigates the impact of augmentations on facial resemblance when using two prominent personalization techniques: DreamBooth and InstantID. Through a series of experiments with diverse subject datasets, we assess the effectiveness of various augmentation strategies on the fidelity of the generated headshots to the original subject. To support this assessment, we introduce FaceDistance, a wrapper around FaceNet that ranks generations by facial similarity. Ultimately, this research provides insights into the role of augmentations in enhancing facial resemblance in SDXL-generated portraits, informing strategies for their effective deployment in downstream applications.
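The ranking step that FaceDistance performs can be illustrated with a minimal sketch. The abstract does not specify the distance metric or the interface, so everything here is an assumption: the sketch uses Euclidean distance between L2-normalized embeddings (a common choice for FaceNet-style embeddings), the function names `face_distance` and `rank_generations` are hypothetical, and the short vectors stand in for real FaceNet embeddings, which would come from a face-embedding model run on cropped faces.

```python
import math
from typing import Dict, List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length so distances depend only on direction."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def face_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two L2-normalized embeddings.

    Assumption: the paper's metric may differ; this is one common choice.
    Smaller values mean the two faces are more similar.
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a_n, b_n)))


def rank_generations(
    reference: Sequence[float],
    candidates: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (name, distance) pairs sorted from most to least similar."""
    scored = [(name, face_distance(reference, emb)) for name, emb in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Placeholder embeddings (hypothetical); real ones would come from FaceNet.
reference_embedding = [1.0, 0.0, 0.0]
generated_embeddings = {
    "gen_a": [0.9, 0.1, 0.0],
    "gen_b": [0.0, 1.0, 0.0],
    "gen_c": [2.0, 0.0, 0.0],
}
ranking = rank_generations(reference_embedding, generated_embeddings)
```

In this toy input, `gen_c` points in the same direction as the reference and so ranks first after normalization, while `gen_b` is orthogonal to it and ranks last. Comparing such rankings across augmentation settings is one way the assessment described above could be carried out.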