LouvreSAE: Sparse Autoencoders for Interpretable and Controllable Style Transfer

📅 2025-12-21
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing artistic style transfer methods rely on fine-tuning, adapters, or prompt engineering, all of which entail high computational overhead and tend to entangle style with content. This paper proposes LouvreSAE, a lightweight, interpretable sparse autoencoder (SAE) operating in the latent space of generative models. Without fine-tuning the underlying model or adding inference-time passes, LouvreSAE disentangles, without supervision, stylistic concepts (brushstrokes, textures, and color palettes) from structural content in artwork. It introduces the first art-specific SAE architecture, yielding human-interpretable, fully decomposable style profiles. Evaluated on ArtBench10, LouvreSAE matches or surpasses state-of-the-art style fidelity (measured by VGG Style Loss and CLIP Score Style), runs 1.7–20× faster than baselines, and enables precise, plug-and-play style transfer from only a few reference images.

📝 Abstract
Artistic style transfer in generative models remains a significant challenge, as existing methods often introduce style only via model fine-tuning, additional adapters, or prompt engineering, all of which can be computationally expensive and may still entangle style with subject matter. In this paper, we introduce a training- and inference-light, interpretable method for representing and transferring artistic style. Our approach leverages an art-specific Sparse Autoencoder (SAE) on top of latent embeddings of generative image models. Trained on artistic data, our SAE learns an emergent, largely disentangled set of stylistic and compositional concepts, corresponding to style-related elements pertaining to brushwork, texture, and color palette, as well as semantic and structural concepts. We call it LouvreSAE and use it to construct style profiles: compact, decomposable steering vectors that enable style transfer without any model updates or optimization. Unlike prior concept-based style transfer methods, our method requires no fine-tuning, no LoRA training, and no additional inference passes, enabling direct steering of artistic styles from only a few reference images. We validate our method on ArtBench10, achieving or surpassing existing methods on style evaluations (VGG Style Loss and CLIP Score Style) while being 1.7-20x faster and, critically, interpretable.
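As a rough illustration of the idea described in the abstract, the sketch below shows a toy sparse autoencoder over latent embeddings and a style profile built by averaging sparse codes of a few reference images. The layer sizes, the top-k sparsity rule, tied decoder weights, and the `style_profile` averaging are all illustrative assumptions, not the paper's actual LouvreSAE architecture.

```python
import numpy as np

# Toy sparse-autoencoder sketch (hypothetical shapes; LouvreSAE operates
# on latent embeddings of a generative image model).
rng = np.random.default_rng(0)
d_latent, d_dict, k = 64, 256, 8   # embedding dim, dictionary size, top-k sparsity

W_enc = rng.standard_normal((d_latent, d_dict)) / np.sqrt(d_latent)
W_dec = W_enc.T.copy()             # tied decoder, for simplicity

def encode(z):
    """Sparse code: ReLU, then keep only the k largest activations per row."""
    a = np.maximum(z @ W_enc, 0.0)
    drop = np.argsort(a)[..., :-k]             # indices of all but the top k
    np.put_along_axis(a, drop, 0.0, axis=-1)   # zero them out
    return a

def decode(a):
    return a @ W_dec

def style_profile(ref_latents):
    """Average the sparse codes of a few reference images into one profile."""
    return encode(ref_latents).mean(axis=0)

refs = rng.standard_normal((4, d_latent))      # latents of 4 reference images
profile = style_profile(refs)
assert np.count_nonzero(profile) <= 4 * k      # profile stays sparse, decomposable
```

Because the profile lives in the SAE's sparse dictionary space, individual nonzero entries can in principle be inspected or zeroed out, which is what makes the representation decomposable.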
Problem

Research questions and friction points this paper is trying to address.

Enables interpretable style transfer without model fine-tuning
Learns disentangled stylistic concepts from artistic data
Achieves fast, controllable style steering from few images
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sparse Autoencoder for style concept learning
Compact style profiles without model updates
Interpretable style transfer from few images
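The "no model updates" claim amounts to steering: the style profile is decoded back into latent space and added to a content latent at generation time. The sketch below illustrates that step; the `steer` function, the `alpha` strength parameter, and all shapes are hypothetical, chosen only to show the plug-and-play mechanism.

```python
import numpy as np

# Hypothetical plug-and-play steering: shift a content latent along the
# decoded style direction. alpha trades style strength vs. content fidelity.
rng = np.random.default_rng(1)
d_latent, d_dict = 64, 256
W_dec = rng.standard_normal((d_dict, d_latent)) / np.sqrt(d_dict)

def steer(content_latent, profile, alpha=1.0):
    """Add the decoded style profile to the latent; no fine-tuning needed."""
    return content_latent + alpha * (profile @ W_dec)

z = rng.standard_normal(d_latent)              # content latent
profile = np.zeros(d_dict)
profile[[3, 17, 42]] = 1.0                     # toy sparse style profile
z_styled = steer(z, profile, alpha=0.8)
assert z_styled.shape == z.shape
```

Since the operation is a single vector addition in latent space, it adds essentially no inference cost, which is consistent with the reported 1.7-20x speedup over fine-tuning and adapter baselines.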
Authors
Raina Panda, Washington High School
Daniel Fein, Stanford University
Arpita Singhal, Stanford University
Mark Fiore, Stanford University
Maneesh Agrawala, Stanford University (Graphics, Computer Graphics, HCI, Visualization)
Matyas Bohacek, Stanford University & Google DeepMind (artificial intelligence, computer vision, media forensics)