🤖 AI Summary
Existing style datasets commonly suffer from intra-style inconsistency, insufficient inter-style diversity, and low sample quality, which hinder effective style representation and transfer. To address these limitations, this work presents MegaStyle-1.4M, a large-scale, high-quality style dataset built by exploiting the consistent text-to-image style mapping of large generative models: a balanced gallery of 170K style prompts and 400K content prompts is combined into content-style prompt pairs that are used to generate the images. Building on this dataset, the authors introduce a style-supervised contrastive learning framework to fine-tune MegaStyle-Encoder, and they train MegaStyle-FLUX, a FLUX-based style transfer model. Experiments indicate that the encoder provides reliable style similarity measurement and that the transfer model generalizes well, supporting the importance of intra-style consistency, inter-style diversity, and sample quality in style data for downstream tasks.
📝 Abstract
In this paper, we introduce MegaStyle, a novel and scalable data curation pipeline that constructs a style dataset that is intra-style consistent, inter-style diverse, and high-quality. We achieve this by leveraging the consistent text-to-image style mapping capability of current large generative models, which can generate images in the same style from a given style description. Building on this foundation, we curate a diverse and balanced prompt gallery with 170K style prompts and 400K content prompts, and generate a large-scale style dataset, MegaStyle-1.4M, via content-style prompt combinations. With MegaStyle-1.4M, we propose style-supervised contrastive learning to fine-tune a style encoder, MegaStyle-Encoder, for extracting expressive, style-specific representations, and we also train a FLUX-based style transfer model, MegaStyle-FLUX. Extensive experiments demonstrate the importance of maintaining intra-style consistency, inter-style diversity, and high quality in a style dataset, as well as the effectiveness of the proposed MegaStyle-1.4M. Moreover, when trained on MegaStyle-1.4M, MegaStyle-Encoder and MegaStyle-FLUX provide reliable style similarity measurement and generalizable style transfer, respectively, making a significant contribution to the style transfer community. More results are available at our project website: https://jeoyal.github.io/MegaStyle/.
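To make the idea of style-supervised contrastive learning concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the generic supervised contrastive (SupCon-style) objective, in which images generated from the same style prompt are treated as positives and all other images in the batch as negatives. The abstract does not specify the exact loss, so the function name `style_supcon_loss`, the `temperature` value, and the use of style-prompt IDs as labels are all illustrative assumptions.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def style_supcon_loss(embeddings, style_ids, temperature=0.1):
    """Generic supervised contrastive loss with style IDs as labels (assumed form).

    embeddings: list of equal-length, non-zero float vectors, one per image.
    style_ids:  list of hashable labels; images sharing a label are positives.
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, num_anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and style_ids[j] == style_ids[i]]
        if not positives:
            continue  # an anchor with no same-style partner contributes nothing
        # Temperature-scaled cosine similarity to every other sample in the batch.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor samples.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values())
        )
        # Average negative log-probability of picking a same-style sample.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        num_anchors += 1
    return total / num_anchors if num_anchors else 0.0


# Toy batch: two hypothetical styles, two images each.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
matched_loss = style_supcon_loss(toy_embeddings, ["ink", "ink", "neon", "neon"])
mismatched_loss = style_supcon_loss(toy_embeddings, ["ink", "neon", "ink", "neon"])
```

In this toy batch, the loss is lower when the labels agree with the geometric clusters (`matched_loss`) than when they cut across them (`mismatched_loss`). This is the property such an objective relies on: it pulls same-style embeddings together regardless of content, which is only meaningful if the training data is intra-style consistent.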