Stable-Makeup: When Real-World Makeup Transfer Meets Diffusion Model

📅 2024-03-12
🏛️ arXiv.org
📈 Citations: 12
Influential: 0
🤖 AI Summary
Existing makeup transfer methods struggle with the complex and diverse makeup styles encountered in real-world scenarios. This paper proposes Stable-Makeup, a diffusion-based framework for high-fidelity makeup transfer in realistic settings. Built on a pre-trained diffusion model, it combines a Detail-Preserving makeup encoder that captures fine-grained makeup details, content and structure control modules that preserve the identity, facial geometry, and semantics of the source image, and newly added makeup cross-attention layers in the U-Net that map makeup details onto the corresponding facial regions. After content-structure decoupling training, the framework supports not only cross-domain makeup transfer but also makeup-guided text-to-image generation. Extensive experiments show state-of-the-art results among existing makeup transfer methods, with strong robustness and generalizability across diverse real-world makeup styles. The code is publicly available.

📝 Abstract
Current makeup transfer methods are limited to simple makeup styles, making them difficult to apply in real-world scenarios. In this paper, we introduce Stable-Makeup, a novel diffusion-based makeup transfer method capable of robustly transferring a wide range of real-world makeup onto user-provided faces. Stable-Makeup is based on a pre-trained diffusion model and utilizes a Detail-Preserving (D-P) makeup encoder to encode makeup details. It also employs content and structural control modules to preserve the content and structural information of the source image. With the aid of our newly added makeup cross-attention layers in the U-Net, we can accurately transfer the detailed makeup to the corresponding position in the source image. After content-structure decoupling training, Stable-Makeup can maintain the content and facial structure of the source image. Moreover, our method has demonstrated strong robustness and generalizability, making it applicable to various tasks such as cross-domain makeup transfer and makeup-guided text-to-image generation. Extensive experiments have demonstrated that our approach delivers state-of-the-art (SOTA) results among existing makeup transfer methods and exhibits highly promising potential for broad applications in various related fields. Code released: https://github.com/Xiaojiu-z/Stable-Makeup
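
To make the makeup cross-attention idea concrete, below is a minimal sketch of such a layer, assuming a standard multi-head cross-attention design in which flattened U-Net features act as queries and tokens from the D-P makeup encoder act as keys and values. The class name, argument names, and dimensions are illustrative assumptions, not the authors' implementation (see the linked repository for the actual code).

```python
# Minimal sketch of a makeup cross-attention layer (illustrative, not the
# paper's code): U-Net spatial features attend to makeup-encoder tokens.
import torch
import torch.nn as nn

class MakeupCrossAttention(nn.Module):
    def __init__(self, dim: int, makeup_dim: int, num_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(
            embed_dim=dim, num_heads=num_heads,
            kdim=makeup_dim, vdim=makeup_dim, batch_first=True,
        )

    def forward(self, hidden_states: torch.Tensor,
                makeup_tokens: torch.Tensor) -> torch.Tensor:
        # hidden_states: (B, H*W, dim) flattened U-Net features (queries)
        # makeup_tokens: (B, N, makeup_dim) encoder output (keys/values)
        attn_out, _ = self.attn(self.norm(hidden_states),
                                makeup_tokens, makeup_tokens)
        # Residual connection leaves the original content pathway intact.
        return hidden_states + attn_out
```

In a full pipeline, one such layer would typically sit alongside the existing text cross-attention in each U-Net block, so that makeup tokens steer appearance while the residual path preserves the source content.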
Problem

Research questions and friction points this paper is trying to address.

Transferring diverse real-world makeup styles robustly
Preserving source image content and facial structure
Enabling cross-domain makeup transfer and text-to-image generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion-based makeup transfer method
Detail-Preserving makeup encoder
Content-structure decoupling training (a minimal training-step sketch follows this list)
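
To make the decoupling idea concrete, here is a minimal sketch of one noise-prediction training step under a standard DDPM objective. The module interfaces (makeup_encoder, content_ctrl, structure_ctrl, and the unet's control/context arguments) are hypothetical stand-ins for the paper's components, not its actual API.

```python
# Hedged sketch of one diffusion training step with decoupled conditioning.
# All module interfaces here are hypothetical stand-ins, not the paper's API.
import torch
import torch.nn.functional as F

def training_step(unet, makeup_encoder, content_ctrl, structure_ctrl,
                  alphas_cumprod, x0, content_img, structure_map, makeup_ref):
    b = x0.shape[0]
    # Sample a diffusion timestep and noise the clean image: q(x_t | x_0).
    t = torch.randint(0, alphas_cumprod.shape[0], (b,), device=x0.device)
    a_t = alphas_cumprod[t].view(b, 1, 1, 1)
    noise = torch.randn_like(x0)
    x_t = a_t.sqrt() * x0 + (1.0 - a_t).sqrt() * noise

    # Makeup details enter as cross-attention context; content and structure
    # enter through separate control modules, so the two signals stay decoupled.
    makeup_tokens = makeup_encoder(makeup_ref)
    control = content_ctrl(content_img) + structure_ctrl(structure_map)

    noise_pred = unet(x_t, t, control=control, context=makeup_tokens)
    return F.mse_loss(noise_pred, noise)
```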
Yuxuan Zhang
Shanghai Jiao Tong University
Lifu Wei
Peking University
Qing Zhang
Shenyang Institute of Automation, Chinese Academy of Sciences
Yiren Song
Ph.D. student, National University of Singapore
Generative AI · Diffusion · Unified model
Jiaming Liu
Xiaohongshu Inc.
Huaxia Li
Xiaohongshu Inc.
Xu Tang
Xiaohongshu Inc.
Yao Hu
Zhejiang University
Machine Learning
Haibo Zhao
Xiaohongshu Inc.