TexHOI: Reconstructing Textures of 3D Unknown Objects in Monocular Hand-Object Interaction Scenes

📅 2025-01-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses high-fidelity texture reconstruction of 3D objects from monocular video of hand–object interaction, where hand occlusion, shadows, indirect illumination, and pose estimation errors distort the recovered texture. The method explicitly models hand-induced occlusion to estimate how much of the environment each point on the object can see, and jointly optimizes skin color reflections and environment lighting to separate the hand's influence from the object's albedo. It works in stages: composite rendering of radiance fields first recovers the geometry and a low-fidelity texture of the object, hand, and background while the hand and object poses are optimized, and physically based rendering parameters are then refined. The authors report that the approach surpasses state-of-the-art methods in texture reconstruction, yielding more precise albedo and more accurate hand illumination and shadow regions. To the authors' knowledge, it is the first method to account for hand–object interactions in object texture reconstruction.
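To make the role of hand visibility concrete, the following is a minimal sketch of diffuse shading at one object surface point. It is an illustration under stated assumptions, not the paper's formulation: the function name `shade_diffuse`, the discrete light samples, and the simple blend between environment radiance and hand-reflected radiance are all hypothetical choices made here.

```python
import math


def shade_diffuse(albedo, normal, samples):
    """Diffuse outgoing radiance at one surface point (illustrative only).

    albedo:  RGB tuple, the surface color we ultimately want to recover.
    normal:  unit surface normal (x, y, z).
    samples: list of (direction, solid_angle, env_rgb, visibility, hand_rgb).
             visibility is 1.0 when the environment is seen along the
             direction and 0.0 when the hand blocks it (assumed input).
    """
    out = [0.0, 0.0, 0.0]
    for direction, solid_angle, env_rgb, visibility, hand_rgb in samples:
        # Lambert cosine term, clamped for directions below the surface.
        cos_theta = max(0.0, sum(n * d for n, d in zip(normal, direction)))
        for c in range(3):
            # Blocked directions receive light reflected off the hand
            # instead of light arriving from the environment.
            incoming = visibility * env_rgb[c] + (1.0 - visibility) * hand_rgb[c]
            out[c] += albedo[c] / math.pi * incoming * cos_theta * solid_angle
    return out


up = (0.0, 0.0, 1.0)
grey = (0.5, 0.5, 0.5)
# Same light direction, once unoccluded and once blocked by the hand.
lit = shade_diffuse(grey, up, [(up, math.pi, (1.0, 1.0, 1.0), 1.0, (0.0, 0.0, 0.0))])
shadowed = shade_diffuse(grey, up, [(up, math.pi, (1.0, 1.0, 1.0), 0.0, (0.2, 0.1, 0.1))])
```

In this toy setup the shadowed point is darker and tinted by the hand-reflected color even though the albedo is unchanged, which is the kind of effect that would be baked into the texture if visibility and skin reflections were ignored.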

📝 Abstract
Reconstructing 3D models of dynamic, real-world objects with high-fidelity textures from monocular frame sequences has been a challenging problem in recent years. This difficulty stems from factors such as shadows, indirect illumination, and inaccurate object pose estimates caused by occluding hand-object interactions. To address these challenges, we propose a novel approach that predicts how the hand affects environmental visibility and indirect illumination when estimating the object's surface albedo. Our method first learns the geometry and low-fidelity texture of the object, hand, and background through composite rendering of radiance fields. At the same time, we optimize the hand and object poses to obtain accurate object pose estimates. We then refine physics-based rendering parameters, including roughness, specularity, albedo, hand visibility, skin color reflections, and environmental illumination, to produce precise albedo and accurate hand illumination and shadow regions. Our approach surpasses state-of-the-art methods in texture reconstruction and, to the best of our knowledge, is the first to account for hand-object interactions in object texture reconstruction.
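
The first stage, composite rendering of radiance fields, can be sketched for a single camera ray as follows. This is a generic density-weighted compositing scheme written for illustration; the abstract does not specify the exact formulation, and the function name `composite_ray` and its argument layout are assumptions made here.

```python
import math


def composite_ray(densities, colors, deltas):
    """Composite several radiance fields sampled at shared points on one ray.

    densities: one list per field (e.g. object, hand, background) holding
               the non-negative density at each sample.
    colors:    one list per field holding an RGB tuple at each sample.
    deltas:    distance covered by each sample along the ray.
    Returns (rgb, per-field weights, remaining transmittance).
    """
    n_fields = len(densities)
    transmittance = 1.0
    rgb = [0.0, 0.0, 0.0]
    field_weight = [0.0] * n_fields
    for i, delta in enumerate(deltas):
        sigma_total = sum(densities[f][i] for f in range(n_fields))
        if sigma_total <= 0.0:
            continue  # empty space: nothing absorbed or emitted here
        alpha = 1.0 - math.exp(-sigma_total * delta)
        weight = transmittance * alpha
        for f in range(n_fields):
            # Each field contributes in proportion to its share of density.
            share = densities[f][i] / sigma_total
            field_weight[f] += weight * share
            for c in range(3):
                rgb[c] += weight * share * colors[f][i][c]
        transmittance *= 1.0 - alpha
    return rgb, field_weight, transmittance


# Field 0: a green object at the second sample. Field 1: a red hand in front.
object_density, hand_density = [0.0, 1000.0], [1000.0, 0.0]
green, red = [(0.0, 1.0, 0.0)] * 2, [(1.0, 0.0, 0.0)] * 2
rgb, weights, remaining = composite_ray(
    [object_density, hand_density], [green, red], [1.0, 1.0]
)
```

Here the opaque hand sample absorbs the whole ray, so the pixel is red and the object receives zero weight. The per-field weights are what make it possible to tell, per pixel, whether the hand or the object was observed.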

Problem

Research questions and friction points this paper is trying to address.

3D Reconstruction
Single View
Pose Estimation

Innovation

Methods, ideas, or system contributions that make the work stand out.

3D Reconstruction
Hand-Object Interaction
Texture Rendering