🤖 AI Summary
Existing makeup generation methods rely on multi-model pipelines and cross-domain feature alignment, and lack text-driven, zero-shot virtual try-on capabilities. This paper proposes the first unified single-model framework that jointly supports multiple image editing tasks—including beauty filtering, makeup transfer, and makeup removal—as well as text-guided makeup synthesis, all in an end-to-end manner. Key contributions include: (1) a cross-domain diffusion architecture with switchable domain embeddings for flexible task control; (2) MT-Text, the first makeup dataset with fine-grained aligned text-image annotations, enabling robust text-image alignment training; and (3) data augmentation and joint optimization strategies achieving state-of-the-art performance across all tasks. The approach significantly reduces deployment complexity and, for the first time, enables zero-shot, text-controllable, and multi-task-compatible makeup editing.
📝 Abstract
Existing makeup techniques often require designing multiple models to handle different inputs and to align features across domains for different makeup tasks, e.g., beauty filtering, makeup transfer, and makeup removal, leading to increased complexity. Another limitation is the absence of text-guided makeup try-on, which is more user-friendly because it does not require reference images. In this study, we make the first attempt to use a single model for various makeup tasks. Specifically, we formulate different makeup tasks as cross-domain translations and leverage a cross-domain diffusion model to accomplish all of them. Unlike existing methods that rely on separate encoder-decoder configurations or cycle-based mechanisms, we propose using different domain embeddings to facilitate domain control. This allows seamless domain switching by merely changing the embedding within a single model, reducing the reliance on additional modules for different tasks. Moreover, to support precise text-to-makeup applications, we introduce the MT-Text dataset by extending the MT dataset with textual annotations, advancing the practicality of makeup technologies.
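The domain-embedding idea in the abstract can be illustrated with a minimal sketch: a single denoiser receives a conditioning vector built from the diffusion timestep plus a per-task embedding, so switching tasks only swaps the embedding. All names here (the domain list, `embed_dim`, the helper functions) are hypothetical illustrations, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
embed_dim = 8  # illustrative size; real models use much larger embeddings

# One learned embedding per makeup task (domain); hypothetical task names.
DOMAINS = ["beauty_filter", "makeup_transfer", "makeup_removal", "text_to_makeup"]
domain_table = {d: rng.normal(size=embed_dim) for d in DOMAINS}

def timestep_embedding(t, dim=embed_dim):
    """Standard sinusoidal timestep embedding used by diffusion models."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    ang = t * freqs
    return np.concatenate([np.sin(ang), np.cos(ang)])

def conditioning_vector(t, domain):
    """The shared denoiser consumes timestep + domain embedding; changing
    tasks means changing only `domain`, not the model weights."""
    return timestep_embedding(t) + domain_table[domain]

c_transfer = conditioning_vector(t=100, domain="makeup_transfer")
c_removal = conditioning_vector(t=100, domain="makeup_removal")
assert c_transfer.shape == (embed_dim,)
assert not np.allclose(c_transfer, c_removal)  # same model, different task signal
```

In practice such a conditioning vector would be injected into the diffusion U-Net (e.g., added to timestep embeddings or via cross-attention); the sketch only shows why a single model suffices for all tasks.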