FLUX-Makeup: High-Fidelity, Identity-Consistent, and Robust Makeup Transfer via Diffusion Transformer

📅 2025-08-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses high-fidelity, identity-preserving, and robust makeup transfer without auxiliary facial control modules. We propose RefLoRAInjector, a lightweight makeup feature injector that decouples the reference image pathway from the backbone network, enabling end-to-end makeup feature learning from source–reference image pairs. Built upon the FLUX-Kontext diffusion Transformer architecture, our method integrates conditional image input with the RefLoRAInjector structure and introduces a high-precision paired makeup data generation pipeline to enhance supervision quality. Experiments demonstrate that our approach significantly outperforms existing methods across diverse scenarios, achieving state-of-the-art performance in makeup fidelity, identity consistency, and robustness to cross-pose and cross-illumination variations—while eliminating reliance on complex auxiliary components such as facial landmarks or 3D morphable models.
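The summary above describes RefLoRAInjector as a lightweight, low-rank injector that keeps the reference-image pathway decoupled from the frozen backbone. A minimal sketch of that idea is shown below, assuming a LoRA-style additive branch over a frozen linear projection; all names, ranks, and shapes here are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class RefLoRAInjectorSketch(nn.Module):
    """Illustrative sketch (not the paper's code) of a LoRA-style
    reference-feature injector: the backbone projection is frozen,
    while a trainable low-rank branch processes reference-image
    tokens on a decoupled pathway and injects them additively."""

    def __init__(self, dim: int, rank: int = 8, scale: float = 1.0):
        super().__init__()
        self.backbone = nn.Linear(dim, dim)
        self.backbone.requires_grad_(False)      # frozen DiT weights
        self.lora_down = nn.Linear(dim, rank, bias=False)
        self.lora_up = nn.Linear(rank, dim, bias=False)
        nn.init.zeros_(self.lora_up.weight)      # zero-init: no-op at start
        self.scale = scale

    def forward(self, x: torch.Tensor, ref: torch.Tensor) -> torch.Tensor:
        # Backbone path sees only the source/noise tokens.
        out = self.backbone(x)
        # Decoupled reference path: the low-rank branch extracts
        # makeup-related features from reference tokens.
        return out + self.scale * self.lora_up(self.lora_down(ref))
```

Because the up-projection is zero-initialized, the module reproduces the frozen backbone exactly at initialization, and only the low-rank branch receives gradients — which is what makes end-to-end makeup feature learning possible without touching backbone weights.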

📝 Abstract
Makeup transfer aims to apply the makeup style from a reference face to a target face and has been increasingly adopted in practical applications. Existing GAN-based approaches typically rely on carefully designed loss functions to balance transfer quality and facial identity consistency, while diffusion-based methods often depend on additional face-control modules or algorithms to preserve identity. However, these auxiliary components tend to introduce extra errors, leading to suboptimal transfer results. To overcome these limitations, we propose FLUX-Makeup, a high-fidelity, identity-consistent, and robust makeup transfer framework that eliminates the need for any auxiliary face-control components. Instead, our method directly leverages source-reference image pairs to achieve superior transfer performance. Specifically, we build our framework upon FLUX-Kontext, using the source image as its native conditional input. Furthermore, we introduce RefLoRAInjector, a lightweight makeup feature injector that decouples the reference pathway from the backbone, enabling efficient and comprehensive extraction of makeup-related information. In parallel, we design a robust and scalable data generation pipeline to provide more accurate supervision during training. The paired makeup datasets produced by this pipeline significantly surpass the quality of all existing datasets. Extensive experiments demonstrate that FLUX-Makeup achieves state-of-the-art performance, exhibiting strong robustness across diverse scenarios.
Problem

Research questions and friction points this paper addresses.

Achieves high-fidelity makeup transfer without auxiliary face-control components
Ensures identity consistency in makeup transfer using source-reference pairs
Introduces robust data generation for superior training supervision
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion Transformer for makeup transfer
RefLoRAInjector decouples reference pathway
Robust data generation pipeline
👥 Authors
Jian Zhu (Nanjing University of Science and Technology)
Shanyuan Liu (360 AI Research)
Liuzhuozheng Li (360 AI Research)
Yue Gong (Beijing University of Aeronautics and Astronautics)
He Wang (Nanjing University of Science and Technology)
Bo Cheng (360 AI Research)
Yuhang Ma (Bytedance, University College London)
Liebucha Wu (360 AI Research)
Xiaoyu Wu (Central University of Finance and Economics)
Dawei Leng
Yuhui Yin (360 AI Research)
Yang Xu (Nanjing University of Science and Technology)