AI Summary
This work addresses three key challenges in PBR material map generation for textureless 3D meshes: slow inference, global texture inconsistency, and limited per-view resolution in multi-view generation. We propose an efficient multi-view joint generation framework. Our core contributions are: (1) a view-packing technique that formulates the arrangement of multi-view renderings as a 2D rectangle bin-packing problem, significantly increasing the effective resolution of each view without additional inference cost; (2) a fine-grained autoregressive backbone integrating cross-view spatial-preserving encoding and multi-domain conditional control (text/image/geometry) to jointly synthesize high-fidelity, globally consistent PBR maps (albedo, normal, roughness, and metallic); and (3) a packed-view representation that remains fully compatible with current 2D generative models. Experiments demonstrate state-of-the-art performance in texture quality and in training and inference efficiency.
Abstract
We present PacTure, a novel framework for generating physically-based rendering (PBR) material textures from an untextured 3D mesh, a text description, and an optional image prompt. Early 2D generation-based texturing approaches generate textures sequentially from different views, resulting in long inference times and globally inconsistent textures. More recent approaches adopt multi-view generation with cross-view attention to enhance global consistency, which, however, limits the resolution for each view. In response to these weaknesses, we first introduce view packing, a novel technique that significantly increases the effective resolution for each view during multi-view generation without imposing additional inference cost, by formulating the arrangement of multi-view maps as a 2D rectangle bin packing problem. In contrast to UV mapping, it preserves the spatial proximity essential for image generation and maintains full compatibility with current 2D generative models. To further reduce the inference cost, we enable fine-grained control and multi-domain generation within the next-scale prediction autoregressive framework to create an efficient multi-view multi-domain generative backbone. Extensive experiments show that PacTure outperforms state-of-the-art methods in both quality of generated PBR textures and efficiency in training and inference.
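To make the view-packing idea concrete, the sketch below arranges the foreground bounding boxes of several rendered views into one square atlas using a simple next-fit shelf heuristic. This is only an illustration of 2D rectangle bin packing under stated assumptions: the abstract does not specify which packing algorithm PacTure uses, and the names `Placement`, `shelf_pack`, and `utilization`, as well as the example view sizes, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    view: int  # index of the view in the input list
    x: int     # left edge inside the atlas, in pixels
    y: int     # top edge inside the atlas, in pixels
    w: int     # width of the view's bounding box
    h: int     # height of the view's bounding box


def shelf_pack(sizes, atlas_w, atlas_h):
    """Place (w, h) rectangles on horizontal shelves, tallest first.

    Returns a list of Placement, or None if the rectangles do not fit
    with this heuristic. Rectangles are not rotated or rescaled.
    """
    # Sorting by decreasing height keeps each shelf reasonably tight.
    order = sorted(range(len(sizes)), key=lambda i: sizes[i][1], reverse=True)
    placements = []
    x = y = shelf_h = 0
    for i in order:
        w, h = sizes[i]
        if w > atlas_w or h > atlas_h:
            return None  # a single rectangle exceeds the atlas
        if x + w > atlas_w:
            # Current shelf is full: open a new shelf below it.
            y += shelf_h
            x = 0
            shelf_h = 0
        if y + h > atlas_h:
            return None  # ran out of vertical space
        placements.append(Placement(i, x, y, w, h))
        x += w
        shelf_h = max(shelf_h, h)
    return placements


def utilization(placements, atlas_w, atlas_h):
    """Fraction of atlas pixels covered by the packed rectangles."""
    covered = sum(p.w * p.h for p in placements)
    return covered / (atlas_w * atlas_h)


if __name__ == "__main__":
    # Hypothetical bounding boxes for six views of a tall, thin object.
    sizes = [(200, 500)] * 4 + [(200, 200)] * 2
    packed = shelf_pack(sizes, 1024, 1024)
    for p in packed:
        print(p)
    print("utilization:", round(utilization(packed, 1024, 1024), 3))
```

Because each view stays a contiguous rectangle, neighboring pixels within a view remain neighbors in the atlas, which is the spatial-proximity property the abstract contrasts with UV mapping. A practical system would presumably also search for the largest scale at which all views still fit; that step is omitted here.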