🤖 AI Summary
To address the challenge of rapidly generating high-quality, multi-view consistent texture maps for 3D geometry, this paper proposes a depth-aware, progressive, diffusion-based texture generation framework. Methodologically: (1) it uses a depth-aware inpainting diffusion model to explicitly incorporate geometric priors; (2) it designs an automatic viewpoint-selection algorithm that orders views to optimize texture coverage and inter-view consistency; and (3) it employs a front-face selection mask and a specialized backprojection strategy to suppress geometric distortion and artifacts from the internal faces of open-surfaced objects. On a single NVIDIA H100 GPU, the framework generates a high-resolution texture end-to-end in just 3.07 seconds while matching or exceeding state-of-the-art methods in visual quality. By combining depth awareness, geometric robustness, and near-interactive speed, this work improves the practicality of texture generation for real-world 3D content creation, such as animation and game development, including interactive creation and text-guided texture editing.
📝 Abstract
We present Make-A-Texture, a new framework that efficiently synthesizes high-resolution texture maps from textual prompts for given 3D geometries. Our approach progressively generates textures that are consistent across multiple viewpoints with a depth-aware inpainting diffusion model, following an optimized sequence of viewpoints determined by an automatic view selection algorithm. A significant feature of our method is its remarkable efficiency: it completes full texture generation in an end-to-end runtime of just 3.07 seconds on a single NVIDIA H100 GPU, significantly outperforming existing methods. This acceleration is achieved by optimizations in the diffusion model and a specialized backprojection method. Moreover, our method reduces artifacts in the backprojection phase by selectively masking out non-frontal faces and the internal faces of open-surfaced objects. Experimental results demonstrate that Make-A-Texture matches or exceeds the quality of other state-of-the-art methods. Our work significantly improves the applicability and practicality of texture generation models for real-world 3D content creation, including interactive creation and text-guided texture editing.
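The front-face selection mask mentioned above can be illustrated with a minimal sketch. The paper does not publish this routine; the snippet below is an assumption-laden illustration of the standard approach: keep only faces whose normals point sufficiently toward the camera (a cosine threshold on the angle between face normal and view direction), so texture is never backprojected onto grazing or rear-facing surfaces. The function name and threshold value are hypothetical.

```python
import numpy as np

def front_face_mask(face_normals, view_dir, cos_threshold=0.3):
    """Select faces oriented toward the camera.

    face_normals : (F, 3) array of unit face normals.
    view_dir     : (3,) unit vector pointing from the surface toward
                   the camera.
    cos_threshold: faces whose normal makes too steep an angle with
                   the view direction (cosine below this value) are
                   masked out, suppressing stretched or rear-face
                   texture projections. The value 0.3 is illustrative.

    Returns a boolean mask of shape (F,): True = safe to texture.
    """
    cos_angle = face_normals @ view_dir   # per-face dot product
    return cos_angle > cos_threshold
```

For example, with the camera looking down the -z axis (`view_dir = [0, 0, 1]`), a face with normal `[0, 0, 1]` is kept, while a rear face (`[0, 0, -1]`) and a grazing side face (`[1, 0, 0]`) are rejected. In practice such a mask would be combined with visibility (depth) tests before backprojecting each generated view into the texture atlas.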