🤖 AI Summary
This work addresses two key limitations of existing image style transfer methods: coarse-grained modeling of photographic styles and insufficient content preservation. We propose a personalized filter learning framework built on a pre-trained text-to-image diffusion model. Methodologically, we combine the model's generative prior with textual inversion to optimize photography-related prompt tokens, which lets the framework disentangle and encode fine-grained photographic attributes (such as film grain, lighting, and tonal rendering) from reference images. This enables editable, attribute-aware style transfer. Unlike conventional GAN- or autoencoder-based approaches, our method requires neither paired training data nor explicit style annotations. Experiments show improved style fidelity and structural content consistency across diverse real-world photographic style transfer tasks. The framework offers a controllable, interpretable, and personalized approach to image editing.
📝 Abstract
Photographic style, as a composition of certain photographic concepts, is the charm behind the work of renowned photographers. However, learning and transferring a photographic style requires a deep understanding of how a photo was edited from its unknown original appearance. Previous works either fail to learn meaningful photographic concepts from reference images or fail to preserve the content of the input image. To tackle these issues, we propose a Personalized Image Filter (PIF). PIF is built on a pretrained text-to-image diffusion model, whose generative prior enables it to learn the average appearance of photographic concepts, as well as how to adjust them according to text prompts. PIF then learns the photographic style of reference images with the textual inversion technique, by optimizing the prompts for the photographic concepts. PIF shows outstanding performance in extracting and transferring a wide range of photographic styles. Project page: https://pif.pages.dev/
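The core idea behind textual inversion, as used above, is that the generative model stays frozen and only a prompt embedding is optimized until the model's output matches the reference. The sketch below illustrates that idea only; it is not the PIF implementation. The frozen "generator" is a hypothetical fixed linear map rather than a diffusion model, the squared-error loss stands in for a diffusion denoising loss, and all names (`render`, `invert_embedding`, `hidden_style`) are assumptions made for illustration.

```python
def render(weights, embedding):
    """Frozen toy 'generator': a fixed linear map from a prompt embedding to image features."""
    return [sum(w * e for w, e in zip(row, embedding)) for row in weights]


def invert_embedding(weights, reference, steps=200, lr=0.1):
    """Learn an embedding by gradient descent; the generator weights are never updated."""
    dim = len(weights[0])
    embedding = [0.0] * dim
    for _ in range(steps):
        residual = [p - r for p, r in zip(render(weights, embedding), reference)]
        # Gradient of 0.5 * ||W e - reference||^2 with respect to e is W^T residual.
        grad = [
            sum(weights[i][j] * residual[i] for i in range(len(weights)))
            for j in range(dim)
        ]
        embedding = [e - lr * g for e, g in zip(embedding, grad)]
    return embedding


# Hypothetical frozen model weights (stand-in for the pretrained diffusion model).
frozen_weights = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
]
weights_before = [row[:] for row in frozen_weights]

# Hypothetical "true" style embedding that produced the reference features.
hidden_style = [0.8, -0.3, 0.5]
reference_features = render(frozen_weights, hidden_style)

learned_style = invert_embedding(frozen_weights, reference_features)
```

In a real setting the learned embedding would be a token vector in the text encoder's vocabulary and the loss would come from the diffusion model's noise prediction, but the split between frozen model parameters and a trainable prompt embedding is the same.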