A Controllable Appearance Representation for Flexible Transfer and Editing

📅 2025-04-21
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
This work addresses the challenge of learning compact, disentangled, and interpretable material appearance representations that enable geometry-aware appearance transfer and interactive editing. We propose a self-supervised FactorVAE framework that achieves strong disentanglement between illumination and material attributes (specifically hue, glossiness, and roughness) using entirely unlabeled data. The learned representation is integrated into a lightweight IP-Adapter that conditions diffusion models for appearance generation. To our knowledge, this is the first unsupervised approach to enable fine-grained, semantically controllable editing (e.g., via intuitive image-space sliders) without requiring ground-truth annotations. Extensive experiments demonstrate that our method outperforms existing supervised and weakly supervised approaches both qualitatively and quantitatively, achieving superior control accuracy while maintaining high-fidelity synthesis.

📝 Abstract
We present a method that computes an interpretable representation of material appearance within a highly compact, disentangled latent space. This representation is learned in a self-supervised fashion using an adapted FactorVAE. We train our model with a carefully designed unlabeled dataset, avoiding possible biases induced by human-generated labels. Our model demonstrates strong disentanglement and interpretability by effectively encoding material appearance and illumination, despite the absence of explicit supervision. Then, we use our representation as guidance for training a lightweight IP-Adapter to condition a diffusion pipeline that transfers the appearance of one or more images onto a target geometry, and allows the user to further edit the resulting appearance. Our approach offers fine-grained control over the generated results: thanks to the well-structured compact latent space, users can intuitively manipulate attributes such as hue or glossiness in image space to achieve the desired final appearance.
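For readers unfamiliar with the base objective: FactorVAE (Kim & Mnih, 2018) augments the standard VAE loss with a total-correlation penalty, estimated adversarially by a discriminator that distinguishes joint latent samples from dimension-wise permuted ones. The PyTorch sketch below is a minimal illustration of that baseline objective only; the latent dimension, discriminator width, and gamma weight are placeholder values, and the paper's specific adaptations (its unlabeled material dataset, encoder architecture, and any loss modifications) are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def permute_dims(z):
    """Shuffle each latent dimension independently across the batch,
    approximating samples from the product of marginals q(z_1)...q(z_d)."""
    B, D = z.size()
    return torch.stack(
        [z[torch.randperm(B, device=z.device), j] for j in range(D)], dim=1
    )

class TCDiscriminator(nn.Module):
    """MLP that classifies latent codes as 'joint' (class 0) vs 'permuted'
    (class 1); sizes are illustrative, not the paper's."""
    def __init__(self, latent_dim=8, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, 2),
        )
    def forward(self, z):
        return self.net(z)

def vae_step_loss(x, recon, mu, logvar, z, disc, gamma=10.0):
    """Standard VAE terms plus the FactorVAE total-correlation penalty."""
    rec = F.mse_loss(recon, x, reduction="sum") / x.size(0)
    kld = (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(1)).mean()
    logits = disc(z)
    tc = (logits[:, 0] - logits[:, 1]).mean()  # density-ratio estimate of TC
    return rec + kld + gamma * tc

def disc_step_loss(z, disc):
    """Train the discriminator to tell joint codes from permuted ones."""
    z = z.detach()
    z_perm = permute_dims(z)
    zeros = torch.zeros(z.size(0), dtype=torch.long, device=z.device)
    ones = torch.ones_like(zeros)
    return 0.5 * (F.cross_entropy(disc(z), zeros)
                  + F.cross_entropy(disc(z_perm), ones))
```

In practice the VAE and the discriminator are optimized in alternating steps, which is why the discriminator loss detaches the latent codes while the TC term in the VAE loss does not.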
Problem

Research questions and friction points this paper is trying to address.

Develop an interpretable material appearance representation in a compact latent space
Enable flexible appearance transfer and editing via a diffusion pipeline
Provide intuitive control over attributes such as hue and glossiness
Innovation

Methods, ideas, or system contributions that make the work stand out.

A self-supervised FactorVAE yields a compact, disentangled representation
A lightweight IP-Adapter guides diffusion-based appearance transfer
Intuitive hue/glossiness editing via the structured latent space (see the sketch below)
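The slider-editing idea can be sketched as: encode an exemplar with the learned encoder, nudge one interpretable latent axis, and project the edited code into cross-attention tokens in the style of an IP-Adapter. Everything below is a hypothetical illustration under assumed names and sizes: the axis indices in ATTR_AXIS, the token count, and the context dimension are placeholders, and the paper's actual adapter projection is not shown in the excerpt above.

```python
import torch
import torch.nn as nn

# Hypothetical mapping from semantic attributes to latent axes; the paper
# discovers which dimensions encode hue/glossiness/roughness without labels.
ATTR_AXIS = {"hue": 0, "glossiness": 1, "roughness": 2}

def slider_edit(z: torch.Tensor, attr: str, delta: float) -> torch.Tensor:
    """Move an appearance code along one interpretable latent axis."""
    z = z.clone()
    z[:, ATTR_AXIS[attr]] += delta
    return z

class AppearanceAdapter(nn.Module):
    """IP-Adapter-style projection: maps the compact appearance code to a
    short sequence of context tokens for the diffusion model's
    cross-attention layers (all sizes here are illustrative)."""
    def __init__(self, latent_dim=8, n_tokens=4, ctx_dim=768):
        super().__init__()
        self.proj = nn.Linear(latent_dim, n_tokens * ctx_dim)
        self.n_tokens, self.ctx_dim = n_tokens, ctx_dim

    def forward(self, z):
        return self.proj(z).view(z.size(0), self.n_tokens, self.ctx_dim)

# Usage (encoder and diffusion pipeline omitted):
#   z = encoder(exemplar_image)             # compact appearance code
#   z = slider_edit(z, "glossiness", +1.5)  # slider: increase glossiness
#   tokens = AppearanceAdapter()(z)         # extra cross-attention context
```

The edited tokens would then be injected alongside the text conditioning during denoising, so a single scalar slider change translates into a consistent appearance change on the target geometry.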