🤖 AI Summary
This work addresses two limitations of existing procedural material generation methods: their heavy reliance on expert knowledge and their poor editability. We propose the first end-to-end learning framework that compiles procedural material node graphs into executable Python programs and fine-tunes a vision-language model (VLM) on this representation. To support training, we introduce an open-source procedural material dataset and an LLM-driven, program-level data augmentation strategy that preserves semantic consistency during code expansion. Our method synthesizes high-fidelity, editable, and reusable node graphs directly from a single input image. Extensive evaluation on both synthetic and real-world images shows significant improvements over state-of-the-art approaches across three key dimensions: material fidelity, structural validity, and editing flexibility.
📝 Abstract
Procedural materials, represented as functional node graphs, are ubiquitous in computer graphics for photorealistic material appearance design. They allow users to perform intuitive and precise editing to achieve desired visual appearances. However, creating a procedural material that matches an input image requires professional knowledge and significant effort. In this work, we leverage the ability to convert procedural materials into standard Python programs and fine-tune a large pre-trained vision-language model (VLM) to generate such programs from input images. To enable effective fine-tuning, we also contribute an open-source procedural material dataset and propose program-level augmentation by prompting another pre-trained large language model (LLM). Through extensive evaluation, we show that our method outperforms previous methods on both synthetic and real-world examples.
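To make the "node graph as Python program" representation concrete, here is a minimal hypothetical sketch. The node names, parameters, and functions below are illustrative assumptions, not the paper's actual compiler output: each procedural node becomes a function call, and each edge in the graph becomes a variable passed between calls.

```python
import numpy as np

def checker(size=64, tiles=8):
    """Pattern-generator node: a grayscale checkerboard map."""
    y, x = np.indices((size, size))
    return (((x * tiles // size) + (y * tiles // size)) % 2).astype(np.float32)

def blur(img, k=3):
    """Filter node: naive box blur via shifted sums (illustrative only)."""
    out = np.zeros_like(img)
    for dy in range(-(k // 2), k // 2 + 1):
        for dx in range(-(k // 2), k // 2 + 1):
            out += np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    return out / (k * k)

def blend(a, b, opacity=0.5):
    """Blend node: linear interpolation between two maps."""
    return (1.0 - opacity) * a + opacity * b

def material_graph():
    """The compiled program: nodes become calls, edges become variables."""
    base = checker()                      # generator node
    soft = blur(base)                     # filter node fed by an edge
    roughness = blend(base, soft, 0.7)    # blend node with two inputs
    return roughness

tex = material_graph()
```

Because the graph is now plain code, editing a parameter (e.g. `tiles` or `opacity`) re-runs the program and regenerates the texture, which is the editability property the work relies on.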