OMGTex: One-stage Multi-style Facial Texture Reconstruction without Geometry Guidance

📅 2026-05-25

📈 Citations: 0

✨ Influential: 0

career value

173K/year

🤖 AI Summary

Existing facial texture reconstruction methods rely on unstable 3D geometric priors and lack semantic disentanglement, hindering region-wise editing and multi-style transfer. This work proposes the first end-to-end diffusion framework that generates high-quality, editable UV textures directly from multi-style face images without geometric guidance. The approach employs a single-stage reconstruction pipeline and gradient-guided inference optimization to mitigate UV misalignment, while a novel semantic distribution-aware training strategy enhances editability. To support this research, we introduce CANVAS, the first multi-style paired texture dataset. Extensive experiments demonstrate state-of-the-art performance across multiple benchmarks, significantly improving cross-domain robustness, style consistency, and semantic editability.

📝 Abstract

We propose OMGTex, an end-to-end diffusion-based framework for reconstructing high-quality and editable facial UV textures from multi-style facial images. Existing texture reconstruction methods face two major limitations: (1) Fragility due to reliance on 3D geometry priors, which are difficult to estimate accurately, especially under facial occlusions or in stylized domains; and (2) A lack of semantic disentanglement, inhibiting region-specific texture editing and style transfer. Our work addresses both challenges simultaneously. Our core innovation is a geometry-free pipeline that directly maps a 2D face image to its corresponding editable UV texture. We introduce two key techniques: First, to address the challenge of UV misalignment common in diffusion generation, we introduce a gradient-guided refinement strategy at inference time, which explicitly corrects structural consistency. Second, we leverage the inherent semantic distribution capability of diffusion models and design a novel training paradigm to enhance this tendency, enabling semantic-aware editing of facial texture. Furthermore, to address the data scarcity in multi-style texture reconstruction, we construct CANVAS, the first comprehensive paired texture reconstruction dataset covering realistic and diverse stylized domains. To the best of our knowledge, OMGTex is the first geometry-free inference framework that achieves robust, style-consistent, and editable facial texture reconstruction across diverse domains. Our method achieves state-of-the-art performance on multiple facial texture benchmarks.

Problem

Research questions and friction points this paper is trying to address.

facial texture reconstruction

geometry-free

semantic disentanglement

multi-style

UV texture

Innovation

Methods, ideas, or system contributions that make the work stand out.

geometry-free

diffusion-based texture reconstruction

semantic-aware editing