🤖 AI Summary
This work addresses the challenge of incomplete 3D geometry and appearance reconstruction from sparse, fixed-view observations by leveraging natural object motions induced by human manipulation. The method transforms static camera views into virtual circumferential perspectives in the object’s local coordinate frame through joint optimization of 6-DoF object poses and geometry. A spherical harmonics-based directional reflectance probing model is introduced to effectively disentangle diffuse and specular reflectance components. Geometry is represented using 2D Gaussian splatting, and an alternating minimization strategy is employed to jointly refine motion trajectories and appearance parameters. Evaluated on both synthetic and real-world datasets with sparse viewpoints, the proposed approach significantly outperforms existing methods, achieving higher-fidelity reconstruction of both geometry and appearance.
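The diffuse/specular factorization described above can be sketched with a toy shading function: a view-independent diffuse color plus a specular term probed along the reflected view direction in a real spherical-harmonics (SH) basis. This is an illustrative assumption-laden sketch (degree-1 SH, hypothetical function and coefficient names), not the paper's actual reflectance model.

```python
import numpy as np

def sh_basis_deg1(d):
    """Real SH basis up to degree 1 for a unit direction d = (x, y, z)."""
    x, y, z = d
    return np.array([
        0.28209479177387814,          # l=0
        -0.4886025119029199 * y,      # l=1, m=-1
        0.4886025119029199 * z,       # l=1, m=0
        -0.4886025119029199 * x,      # l=1, m=1
    ])

def reflect(view_dir, normal):
    """Mirror the view direction about the surface normal: r = 2(n·v)n - v."""
    return 2.0 * np.dot(normal, view_dir) * normal - view_dir

def shade(diffuse_rgb, spec_sh_rgb, view_dir, normal):
    """Diffuse + specular color; spec_sh_rgb is (4, 3): SH coeffs per channel."""
    r = reflect(view_dir, normal)
    r = r / np.linalg.norm(r)
    specular = sh_basis_deg1(r) @ spec_sh_rgb  # probe SH along the reflected dir
    return np.clip(diffuse_rgb + specular, 0.0, 1.0)

# Toy usage: a gray diffuse surface with a specular lobe along +z.
diffuse = np.array([0.5, 0.5, 0.5])
spec_sh = np.zeros((4, 3))
spec_sh[2] = 0.3                      # energy in the l=1, m=0 (z) band, all channels
n = np.array([0.0, 0.0, 1.0])
v = np.array([0.0, 0.0, 1.0])         # head-on view -> reflection along +z
print(shade(diffuse, spec_sh, v, n))  # ≈ [0.647 0.647 0.647]
```

Because the specular term depends only on the reflected direction, it varies as the object moves under static illumination while the diffuse term stays fixed, which is what makes the two components separable.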
📝 Abstract
Reconstructing 3D geometry and appearance from a sparse set of fixed cameras is a foundational task with broad applications, yet it remains fundamentally constrained by the limited viewpoints. We show that this bound can be broken by exploiting opportunistic object motion: as a person manipulates an object (e.g., moving a chair or lifting a mug), the static cameras effectively "orbit" the object in its local coordinate frame, providing additional virtual viewpoints. Harnessing this object motion, however, poses two challenges: the tight coupling of object pose and geometry estimation and the complex appearance variations of a moving object under static illumination. We address these by formulating a joint pose and shape optimization using 2D Gaussian splatting with alternating minimization of 6-DoF trajectories and primitive parameters, and by introducing a novel appearance model that factorizes diffuse and specular components with reflected directional probing within the spherical harmonics space. Extensive experiments on synthetic and real-world datasets with extremely sparse viewpoints demonstrate that our method recovers significantly more accurate geometry and appearance than state-of-the-art baselines.
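The alternating minimization between per-frame poses and shared geometry can be illustrated with a rigid toy problem: fix the geometry and solve each frame's pose in closed form (Procrustes/Kabsch), then fix the poses and re-estimate the geometry by averaging back-transformed observations. This 2D, noiseless sketch is a stand-in for the paper's 6-DoF trajectory and 2D Gaussian primitive updates; all variable names are illustrative.

```python
import numpy as np

# Synthetic data: a latent point set observed under per-frame rigid motions.
rng = np.random.default_rng(0)
P_true = rng.normal(size=(50, 2))     # latent geometry (2D points)
thetas = np.linspace(0.0, 1.5, 8)     # per-frame rotation angles
trans = rng.normal(size=(8, 2))       # per-frame translations

def rot(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s], [s, c]])

obs = [P_true @ rot(a).T + t for a, t in zip(thetas, trans)]

# Initialize geometry from the first frame, then alternate the two steps.
P = obs[0].copy()
for _ in range(20):
    poses = []
    for O in obs:                     # pose step: closed-form Procrustes fit
        Pc, Oc = P - P.mean(0), O - O.mean(0)
        U, _, Vt = np.linalg.svd(Oc.T @ Pc)
        R = U @ np.diag([1.0, np.sign(np.linalg.det(U @ Vt))]) @ Vt
        t = O.mean(0) - P.mean(0) @ R.T
        poses.append((R, t))
    # geometry step: average observations mapped back into the object frame
    P = np.mean([(O - t) @ R for (R, t), O in zip(poses, obs)], axis=0)

err = max(np.linalg.norm(O - (P @ R.T + t)) for (R, t), O in zip(poses, obs))
print(f"max frame residual: {err:.2e}")
```

The recovered geometry is only defined up to a global rigid transform (the gauge fixed here by initializing from the first frame), which mirrors why the pose and shape estimates are tightly coupled and benefit from joint optimization.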