MatLat: Material Latent Space for PBR Texture Generation

📅 2025-12-19
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the scarcity of PBR texture data, the distribution shift caused by freezing the embedding network in existing methods, and the challenges of cross-view inconsistency and misalignment between latent and pixel spaces, this paper proposes MatLat, a material-aware latent space framework. MatLat fine-tunes a pre-trained VAE to construct a multi-channel material encoder and jointly optimizes the diffusion-based generation process. It introduces correspondence-aware perceptual attention and a locality-preserving patch alignment regularization in latent space, explicitly enforcing cross-view consistency and mitigating distribution shift. Experiments demonstrate that MatLat significantly outperforms state-of-the-art methods in PBR texture fidelity, achieving consistent improvements in SSIM, LPIPS, and material physical plausibility metrics. Ablation studies validate the essential contribution of each component to overall performance.

📝 Abstract
We propose a generative framework for producing high-quality PBR textures on a given 3D mesh. As large-scale PBR texture datasets are scarce, our approach focuses on effectively leveraging the embedding space and diffusion priors of pretrained latent image generative models while learning a material latent space, MatLat, through targeted fine-tuning. Unlike prior methods that freeze the embedding network and thus lead to distribution shifts when encoding additional PBR channels and hinder subsequent diffusion training, we fine-tune the pretrained VAE so that new material channels can be incorporated with minimal latent distribution deviation. We further show that correspondence-aware attention alone is insufficient for cross-view consistency unless the latent-to-image mapping preserves locality. To enforce this locality, we introduce a regularization in the VAE fine-tuning that crops latent patches, decodes them, and aligns the corresponding image regions to maintain strong pixel-latent spatial correspondence. Ablation studies and comparison with previous baselines demonstrate that our framework improves PBR texture fidelity and that each component is critical for achieving state-of-the-art performance.
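
The locality regularization described in the abstract (crop a latent patch, decode it, and align it with the corresponding image region) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names (`crop`, `upsample`, `global_decoder`, `patch_alignment_loss`), the use of plain nested lists instead of tensors, the two toy decoders standing in for a VAE decoder, and the choice of mean squared error as the alignment term.

```python
def crop(grid, top, left, height, width):
    """Return a height x width window of a 2D list."""
    return [row[left:left + width] for row in grid[top:top + height]]


def upsample(latent, scale):
    """Toy 'local' decoder: nearest-neighbour upsampling by `scale`.

    Each output pixel depends only on the latent cell directly beneath it.
    """
    image = []
    for row in latent:
        wide = [value for value in row for _ in range(scale)]
        image.extend(list(wide) for _ in range(scale))
    return image


def global_decoder(latent, scale):
    """Toy 'non-local' decoder: every pixel also depends on the global mean."""
    flat = [value for row in latent for value in row]
    mean = sum(flat) / len(flat)
    return [[value + mean for value in row] for row in upsample(latent, scale)]


def patch_alignment_loss(decode, latent, image, top, left, size, scale):
    """MSE between a decoded latent patch and the matching image region.

    Hypothetical stand-in for the regularizer: the latent patch at
    (top, left) is decoded on its own and compared with the image crop
    at (top * scale, left * scale).
    """
    latent_patch = crop(latent, top, left, size, size)
    decoded = decode(latent_patch, scale)
    target = crop(image, top * scale, left * scale, size * scale, size * scale)
    errors = [
        (d - t) ** 2
        for decoded_row, target_row in zip(decoded, target)
        for d, t in zip(decoded_row, target_row)
    ]
    return sum(errors) / len(errors)


# 4x4 toy latent with values 0..15 and a decoder upscaling factor of 2.
latent = [[float(4 * r + c) for c in range(4)] for r in range(4)]
scale = 2

# Each decoder is scored against its own full-resolution output.
local_loss = patch_alignment_loss(
    upsample, latent, upsample(latent, scale), 0, 0, 2, scale
)
global_loss = patch_alignment_loss(
    global_decoder, latent, global_decoder(latent, scale), 0, 0, 2, scale
)

print("local decoder loss:", local_loss)
print("non-local decoder loss:", global_loss)
```

In this sketch the local decoder incurs zero loss, because decoding a patch in isolation reproduces the same pixels as decoding the full latent. The non-local decoder is penalized, because its output changes when the surrounding latent context is removed. A regularizer of this general shape would push a fine-tuned decoder toward the first behaviour, which is the pixel-latent spatial correspondence the abstract refers to.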

Problem

Research questions and friction points this paper is trying to address.

Generates high-quality PBR textures for 3D meshes
Addresses scarcity of large-scale PBR texture datasets
Mitigates distribution shifts when adding material channels

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tunes pretrained VAE for minimal latent distribution deviation
Introduces latent patch cropping regularization for spatial correspondence
Leverages embedding space and diffusion priors of latent image models