🤖 AI Summary
Existing artistic style transfer methods rely on fine-tuning, adapters, or prompt engineering, entailing high computational overhead and entangling style with content. This paper proposes LouvreSAE, a lightweight, interpretable sparse autoencoder (SAE) operating in the latent space of generative models. Trained once on artistic data, LouvreSAE disentangles stylistic concepts, including brushstrokes, textures, and color palettes, from structural content in an unsupervised manner, and requires no model fine-tuning or inference-time modifications for transfer. It introduces the first art-specific SAE architecture, yielding human-interpretable, fully decomposable style profile vectors. Evaluated on ArtBench10, LouvreSAE achieves state-of-the-art style fidelity (measured by VGG Style Loss and CLIP Score Style), accelerates inference by 1.7–20× over baselines, and enables precise, plug-and-play style transfer from only a few reference images.
📝 Abstract
Artistic style transfer in generative models remains a significant challenge: existing methods typically introduce style via model fine-tuning, additional adapters, or prompt engineering, all of which can be computationally expensive and may still entangle style with subject matter. In this paper, we introduce a training- and inference-light, interpretable method for representing and transferring artistic style. Our approach applies an art-specific Sparse Autoencoder (SAE) to the latent embeddings of generative image models. Trained on artistic data, the SAE learns an emergent, largely disentangled set of stylistic and compositional concepts, corresponding to style-related elements pertaining to brushwork, texture, and color palette, as well as semantic and structural concepts. We call this model LouvreSAE and use it to construct style profiles: compact, decomposable steering vectors that enable style transfer without any model updates or optimization. Unlike prior concept-based style transfer methods, ours requires no fine-tuning, no LoRA training, and no additional inference passes, enabling direct steering toward artistic styles from only a few reference images. We validate our method on ArtBench10, matching or surpassing existing methods on style evaluations (VGG Style Loss and CLIP Score Style) while being 1.7–20× faster and, critically, interpretable.
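To make the style-profile idea concrete, here is a minimal NumPy sketch of the general recipe the abstract describes: encode a few reference latents with an SAE, average and sparsify the concept activations into a compact profile, and add the decoded profile to a content latent as a steering vector. All names, dimensions, and the random stand-in SAE weights are illustrative assumptions, not the paper's actual architecture or parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions for illustration, not from the paper).
d_latent, d_concepts = 64, 256

# A pretrained SAE would supply these weights; random stand-ins here.
W_enc = rng.standard_normal((d_latent, d_concepts)) / np.sqrt(d_latent)
W_dec = rng.standard_normal((d_concepts, d_latent)) / np.sqrt(d_concepts)

def sae_encode(z):
    """Sparse, non-negative concept activations for a latent z (ReLU SAE)."""
    return np.maximum(z @ W_enc, 0.0)

def style_profile(reference_latents, top_k=16):
    """Average concept activations over a few reference images and keep
    only the top-k concepts: a compact, decomposable style profile."""
    acts = np.mean([sae_encode(z) for z in reference_latents], axis=0)
    profile = np.zeros_like(acts)
    top = np.argsort(acts)[-top_k:]
    profile[top] = acts[top]
    return profile

def steer(z, profile, alpha=1.0):
    """Add the decoded style direction to a content latent
    (no model updates, no extra inference passes)."""
    return z + alpha * (profile @ W_dec)

# Build a profile from three reference latents and steer a new latent.
refs = [rng.standard_normal(d_latent) for _ in range(3)]
profile = style_profile(refs)
styled = steer(rng.standard_normal(d_latent), profile, alpha=0.8)
```

Because the profile is just a sparse vector over named concepts, individual entries (e.g. a brushwork or palette concept) could in principle be inspected, scaled, or zeroed out independently, which is what makes the representation decomposable and interpretable.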