MatPedia: A Universal Generative Foundation for High-Fidelity Material Synthesis

📅 2025-11-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing PBR material generation methods rely on hand-crafted designs and lack a unified representation, hindering coherent modeling of the relationship between RGB appearance and underlying physical properties. This leads to task fragmentation and an inability to leverage large-scale RGB image data. This work proposes the first joint RGB-PBR representation framework, encoding appearance and physical attributes as dual latent variable sequences, and models them via a structured 5-frame video diffusion architecture to support text/image-to-material generation and intrinsic decomposition. Trained on the newly constructed hybrid dataset MatHybrid-410K, which integrates massive RGB image collections with high-fidelity PBR data, the model generates materials at native 1024×1024 resolution. It significantly outperforms prior methods in fidelity, diversity, and multi-task generalization, establishing the first universal foundation model for industrial-grade material generation.

📝 Abstract
Physically-based rendering (PBR) materials are fundamental to photorealistic graphics, yet their creation remains labor-intensive and requires specialized expertise. While generative models have advanced material synthesis, existing methods lack a unified representation bridging natural image appearance and PBR properties, leading to fragmented task-specific pipelines and an inability to leverage large-scale RGB image data. We present MatPedia, a foundation model built upon a novel joint RGB-PBR representation that compactly encodes materials into two interdependent latents: one for RGB appearance and one for the four PBR maps encoding complementary physical properties. By formulating them as a 5-frame sequence and employing video diffusion architectures, MatPedia naturally captures their correlations while transferring visual priors from RGB generation models. This joint representation enables a unified framework handling multiple material tasks--text-to-material generation, image-to-material generation, and intrinsic decomposition--within a single architecture. Trained on MatHybrid-410K, a mixed corpus combining PBR datasets with large-scale RGB images, MatPedia achieves native $1024\times1024$ synthesis that substantially surpasses existing approaches in both quality and diversity.
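The 5-frame formulation in the abstract can be sketched as a simple packing step: the RGB render plus four PBR maps are stacked along a frame axis so a video diffusion backbone can treat them as one short clip. The abstract does not name the four maps; albedo, normal, roughness, and metallic are assumed here as a common PBR set, and the tiling of single-channel maps to three channels is an illustrative choice, not the paper's confirmed pipeline.

```python
import numpy as np

def pack_material_frames(rgb, albedo, normal, roughness, metallic):
    """Pack an RGB render plus four assumed PBR maps into a 5-frame sequence.

    Single-channel maps (roughness, metallic) are tiled to 3 channels so
    every frame shares the (H, W, 3) shape a video model expects.
    """
    def to_rgb(m):
        # Tile a (H, W) scalar map to (H, W, 3); pass 3-channel maps through.
        return np.repeat(m[..., None], 3, axis=-1) if m.ndim == 2 else m

    frames = [to_rgb(m) for m in (rgb, albedo, normal, roughness, metallic)]
    return np.stack(frames, axis=0)  # shape: (5, H, W, 3)

# Toy 4x4 maps: rgb/albedo/normal are 3-channel, roughness/metallic scalar.
H = W = 4
seq = pack_material_frames(
    np.zeros((H, W, 3)), np.zeros((H, W, 3)), np.zeros((H, W, 3)),
    np.zeros((H, W)), np.ones((H, W)),
)
print(seq.shape)  # (5, 4, 4, 3)
```

In the actual model these five frames are encoded into the two interdependent latents described above (one for the RGB frame, one for the four PBR frames) rather than fed as raw pixels.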
Problem

Research questions and friction points this paper is trying to address.

Unified representation bridging natural images and PBR material properties
Single framework for multiple material synthesis tasks
Overcoming fragmented pipelines by leveraging large-scale RGB data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Joint RGB-PBR representation encoding materials
Video diffusion architecture for correlation capture
Unified framework handling multiple material tasks
Authors
Di Luo — Nankai University
Shuhui Yang — Tencent
Mingxin Yang — Peking University
Jiawei Lu — Nankai University
Yixuan Tang — Tencent Hunyuan, Xi'an Jiaotong University
Xintong Han — Huya Inc
Zhuo Chen — Tencent Hunyuan
Beibei Wang — Nanjing University
Chunchao Guo — Tencent Hunyuan