🤖 AI Summary
Existing methods generate 3D textures via hand-crafted back-projection and multi-view image averaging, often introducing seams and geometric distortions. This paper proposes a neural back-projection framework that directly maps 2D images, synthesized by multi-view diffusion models, onto the 3D surface texture space. The key contributions are twofold: (i) a 3D-aware positional encoding that integrates 3D coordinates, surface normals, and geodesic distances; and (ii) a surface-adaptive neural attention module that replaces conventional fusion strategies, enabling geometrically consistent inter-view texture alignment and continuous interpolation over the surface. Experiments demonstrate state-of-the-art performance in high-resolution texture synthesis, with marked improvements in texture continuity, detail fidelity, and geometric alignment accuracy, effectively mitigating seams and visual artifacts.
📝 Abstract
We present Im2SurfTex, a method that generates textures for input 3D shapes by learning to aggregate multi-view image outputs produced by 2D image diffusion models onto the shapes' texture space. Unlike existing texture generation techniques that rely on ad hoc back-projection and averaging schemes to blend multi-view images into textures, often resulting in texture seams and artifacts, our approach employs a trained, feed-forward neural module to boost texture coherency. The key ingredient of our module is neural attention combined with appropriate positional encodings of image pixels, based on their corresponding 3D point positions, normals, and surface-aware coordinates encoded as geodesic distances within surface patches. These encodings capture texture correlations between neighboring surface points, ensuring better texture continuity. Experimental results show that our module improves texture quality, achieving superior performance in high-resolution texture generation.
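To make the core idea concrete, here is a minimal, illustrative sketch of attention-based multi-view blending. It is not the paper's implementation: the encoding and attention weights in Im2SurfTex are learned, whereas this toy version uses a hand-written encoding (position, normal, geodesic distance concatenated) and plain dot-product softmax attention; all function names and inputs are hypothetical.

```python
import math

def encode(position, normal, geodesic_dist):
    """Toy positional encoding: concatenate the 3D point position,
    its surface normal, and its geodesic distance within the surface
    patch. (The paper's encoding is learned; this is a stand-in.)"""
    return list(position) + list(normal) + [geodesic_dist]

def attention_blend(query, candidates):
    """Blend candidate colors from multiple views via softmax attention.

    query      -- encoding of the target texel (list of floats)
    candidates -- list of (encoding, rgb_color) pairs, one per view
    Returns the attention-weighted RGB color.
    """
    # Dot-product similarity between the texel's encoding and each
    # candidate pixel's encoding.
    scores = [sum(q * k for q, k in zip(query, enc)) for enc, _ in candidates]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Geometrically nearby, similarly oriented candidates get higher
    # weight, which is what suppresses seams relative to plain averaging.
    return tuple(sum(w * rgb[i] for w, (_, rgb) in zip(weights, candidates))
                 for i in range(3))

# Usage: a view whose pixel encoding matches the texel's geometry
# closely dominates the blend over a geometrically distant one.
q = encode([0.0, 0.0, 0.0], [0.0, 0.0, 1.0], 0.0)
near = (encode([0.0, 0.0, 0.0], [0.0, 0.0, 1.0], 0.0), (1.0, 0.0, 0.0))
far = (encode([5.0, 5.0, 5.0], [1.0, 0.0, 0.0], 3.0), (0.0, 0.0, 1.0))
color = attention_blend(q, [near, far])
```

A naive averaging scheme would weight both views equally regardless of geometry; the attention formulation instead lets nearby, consistently oriented surface points dominate, which is the intuition behind the improved texture continuity.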