AI Summary
Existing cross-domain image translation methods struggle to achieve race-specific attribute transfer and identity preservation simultaneously without reference images: CycleGAN is restricted to two domains; StarGAN lacks fine-grained style control; and StarGANv2/StyleGAN rely on reference images and suffer from insufficient identity consistency. To address this, we propose RaceGAN, the first GAN framework enabling reference-free, multi-racial (Asian/White/Black) facial image translation. RaceGAN explicitly models and clusters race-specific style codes in the latent space via disentangled representation learning and adaptive instance normalization, enabling fine-grained stylistic control while preserving high-level semantics and individual identity. A classification feedback mechanism and cycle-consistency constraints are jointly optimized. On the Chicago Face Dataset, RaceGAN significantly outperforms state-of-the-art methods: InceptionResNetv2-based evaluation confirms improved racial translation accuracy, and t-SNE visualizations demonstrate clear separability of racial features in the learned latent space.
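The adaptive instance normalization (AdaIN) mentioned above can be sketched as follows. This is a minimal NumPy illustration of the generic AdaIN operation, where scale and shift parameters (here random stand-ins for values that a real model would predict from a style code) modulate normalized content features; it is not RaceGAN's actual layer, and all names are hypothetical:

```python
import numpy as np

def adain(x, gamma, beta, eps=1e-5):
    """Adaptive instance normalization: normalize each (sample, channel)
    feature map to zero mean / unit variance, then scale and shift it
    with style-derived parameters gamma and beta."""
    mu = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    x_norm = (x - mu) / np.sqrt(var + eps)
    return gamma[:, :, None, None] * x_norm + beta[:, :, None, None]

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 8, 8))   # content features, shape (B, C, H, W)
gamma = rng.normal(size=(2, 4))     # per-channel style scale (hypothetical)
beta = rng.normal(size=(2, 4))      # per-channel style shift (hypothetical)
y = adain(x, gamma, beta)
```

After this operation, each channel's statistics follow the style parameters (mean ≈ beta, standard deviation ≈ |gamma|), which is how a style code can repaint low-level appearance without altering spatial content.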
Abstract
Generative adversarial networks (GANs) have driven significant progress in unpaired image-to-image translation across several applications in recent years. CycleGAN led the way, although it was restricted to a single pair of domains. StarGAN overcame this constraint by tackling image-to-image translation across multiple domains, yet it could not capture in-depth, low-level style changes within those domains. StarGANv2 and StyleGAN made style mapping via reference-guided image synthesis possible; however, these models do not preserve individual identity and require an extra reference image in addition to the input. Our study aims to translate racial traits by means of multi-domain image-to-image translation. We present RaceGAN, a novel framework capable of mapping style codes across several domains during racial attribute translation while maintaining individual identity and high-level semantics, without relying on a reference image. RaceGAN outperforms other models in translating racial features (i.e., Asian, White, and Black) when tested on the Chicago Face Dataset. We also report quantitative findings using InceptionResNetv2-based classification to demonstrate the effectiveness of our racial translation. Moreover, we investigate how well the model partitions the latent space into distinct clusters of faces for each ethnic group.
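The cycle-consistency constraint referenced above can be illustrated with a minimal sketch: translate an image into the target domain with one style, translate it back with the source style, and penalize the L1 distance to the original. A toy invertible "generator" (a hypothetical stand-in, not RaceGAN's network) makes the round-trip behavior easy to check:

```python
import numpy as np

def cycle_loss(G, x, s_src, s_tgt):
    """L1 cycle-consistency: x -> target domain -> back to source domain,
    then compare the reconstruction with the original input."""
    x_fake = G(x, s_tgt)      # source image rendered with the target style
    x_rec = G(x_fake, s_src)  # translated back with the source style
    return np.abs(x_rec - x).mean()

# Toy "generator": scales the image by the style value (hypothetical).
G = lambda x, s: x * s

x = np.linspace(-1.0, 1.0, 12).reshape(3, 4)
# Styles chosen as exact inverses, so the round trip reconstructs x
# perfectly and the loss is zero; a real generator only approximates this.
loss = cycle_loss(G, x, s_src=0.5, s_tgt=2.0)
```

In training, this term is minimized jointly with the adversarial and classification objectives, pushing the generator to change only domain-specific attributes while keeping identity intact.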