CustomTex: High-fidelity Indoor Scene Texturing via Multi-Reference Customization

📅 2026-03-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing methods for high-fidelity, customizable indoor scene texture generation often lack fine-grained, instance-level control and frequently suffer from low visual quality, artifacts, and baked-in lighting. To address these limitations, this work proposes CustomTex, a framework that takes multiple reference images and applies a dual distillation mechanism, operating at both the semantic and pixel levels, within a Variational Score Distillation (VSD) optimization framework. Instance-aware cross-attention aligns each reference image with its corresponding 3D scene instance. The approach improves instance-wise texture consistency and sharpness, reduces entanglement with illumination, and suppresses artifacts, enabling high-quality appearance customization at the object-instance level.
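For orientation, the VSD gradient can be written in the form in which it was introduced in ProlificDreamer. The first equation is that standard form; the second is only an assumed way of combining a semantic-level and a pixel-level distillation term. The weights \lambda_{sem} and \lambda_{pix} and the superscripts are hypothetical notation, not taken from the paper.

```latex
% Standard VSD gradient: \theta are the texture parameters, g(\theta, c) is the
% image rendered from camera c, y is the conditioning, \epsilon_{pre} is the
% frozen pretrained denoiser and \epsilon_{\phi} is the fine-tuned (e.g. LoRA) one.
\nabla_\theta \mathcal{L}_{\mathrm{VSD}}
  = \mathbb{E}_{t,\epsilon,c}\!\left[
      \omega(t)\,\bigl(\epsilon_{\mathrm{pre}}(x_t, t, y)
                     - \epsilon_{\phi}(x_t, t, c, y)\bigr)\,
      \frac{\partial g(\theta, c)}{\partial \theta}
    \right],
\qquad
x_t = \alpha_t\, g(\theta, c) + \sigma_t\, \epsilon .

% Assumed (hypothetical) combination of the two distillation levels.
\nabla_\theta \mathcal{L}
  = \lambda_{\mathrm{sem}}\, \nabla_\theta \mathcal{L}_{\mathrm{VSD}}^{\mathrm{sem}}
  + \lambda_{\mathrm{pix}}\, \nabla_\theta \mathcal{L}_{\mathrm{VSD}}^{\mathrm{pix}} .
```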

📝 Abstract
The creation of high-fidelity, customizable 3D indoor scene textures remains a significant challenge. While text-driven methods offer flexibility, they lack the precision for fine-grained, instance-level control, and often produce textures with insufficient quality, artifacts, and baked-in shading. To overcome these limitations, we introduce CustomTex, a novel framework for instance-level, high-fidelity scene texturing driven by reference images. CustomTex takes an untextured 3D scene and a set of reference images specifying the desired appearance for each object instance, and generates a unified, high-resolution texture map. The core of our method is a dual-distillation approach that separates semantic control from pixel-level enhancement. We employ semantic-level distillation, equipped with instance cross-attention, to ensure semantic plausibility and "reference-instance" alignment, and pixel-level distillation to enforce high visual fidelity. Both are unified within a Variational Score Distillation (VSD) optimization framework. Experiments demonstrate that CustomTex achieves precise instance-level consistency with reference images and produces textures with superior sharpness, reduced artifacts, and minimal baked-in shading compared to state-of-the-art methods. Our work establishes a more direct and user-friendly path to high-quality, customizable 3D scene appearance editing.
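The abstract does not specify how the instance cross-attention is built. One plausible reading is a masked cross-attention in which each rendered pixel or patch attends only to reference tokens that belong to its own object instance. The NumPy sketch below illustrates that reading under stated assumptions: the function name, the argument layout, and the use of raw reference features as both keys and values (with no learned projections) are hypothetical simplifications, not the authors' implementation.

```python
import numpy as np

def instance_cross_attention(queries, query_instance, ref_tokens, ref_instance):
    """Masked cross-attention restricted to matching instance ids (illustrative).

    queries:        (Q, D) features of rendered pixels/patches
    query_instance: (Q,)   instance id of each query
    ref_tokens:     (R, D) features pooled from all reference images
    ref_instance:   (R,)   instance id each reference token describes
    Returns (output, weights) with shapes (Q, D) and (Q, R).
    """
    d = queries.shape[1]
    # Scaled dot-product scores between every query and every reference token.
    scores = queries @ ref_tokens.T / np.sqrt(d)

    # A query may only attend to reference tokens of the same instance.
    same = query_instance[:, None] == ref_instance[None, :]
    scores = np.where(same, scores, -np.inf)

    # Numerically stable softmax; queries without any reference get zero weights.
    has_ref = same.any(axis=1, keepdims=True)
    row_max = np.where(has_ref, scores.max(axis=1, keepdims=True), 0.0)
    exp = np.exp(scores - row_max)
    denom = exp.sum(axis=1, keepdims=True)
    weights = exp / np.where(denom > 0, denom, 1.0)

    return weights @ ref_tokens, weights
```

In this sketch a query whose instance has no reference image receives an all-zero output, so a real system would need a fallback for such instances (for example, a text prompt); that choice is also an assumption.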
Problem

Research questions and friction points this paper is trying to address.

high-fidelity texturing
instance-level control
3D indoor scene
customizable texture
reference-driven synthesis
Innovation

Methods, ideas, or system contributions that make the work stand out.

CustomTex
instance-level texturing
dual-distillation
Variational Score Distillation
reference-driven synthesis
Weilin Chen
Ph.D. from Guangdong University of Technology
causal inference, causality and its applications
Jiahao Rao
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University
Wenhao Wang
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University
Xinyang Li
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University
Xuan Cheng
School of Informatics, Xiamen University, China
Computer Graphics, 3D Vision, Multimedia
Liujuan Cao
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University