AI Summary
Photometric stereo (PS) estimates surface normals inaccurately in complex real-world scenes, such as shadowed, self-occluded, or bias-illuminated regions, because multi-light cues become unreliable there. To address this, we propose GeoUniPS, a geometrically unified PS framework built around a dual-branch (photometric-geometric) encoder, in which a frozen, pre-trained large-scale 3D reconstruction model serves as a vision-geometry foundation model that supplies highly generalizable high-level geometric priors. We replace the conventional orthographic projection assumption with a perspective projection model that explicitly accounts for spatially varying view directions. To support normal estimation under perspective imaging, we introduce the PS-Perp dataset. By jointly leveraging synthetic supervision and geometry-aware priors, GeoUniPS significantly improves normal estimation accuracy and robustness across multiple benchmarks, particularly in shadowed and self-occluded regions, and achieves state-of-the-art performance in both quantitative metrics and visual quality.
Abstract
Universal Photometric Stereo is a promising approach for recovering surface normals without strict lighting assumptions. However, it struggles when multi-illumination cues are unreliable, such as under biased lighting or in shadowed or self-occluded regions of complex in-the-wild scenes. We propose GeoUniPS, a universal photometric stereo network that integrates synthetic supervision with high-level geometric priors from large-scale 3D reconstruction models pretrained on massive in-the-wild data. Our key insight is that these 3D reconstruction models serve as visual-geometry foundation models, inherently encoding rich geometric knowledge of real scenes. To leverage this, we design a Light-Geometry Dual-Branch Encoder that extracts multi-illumination cues from the input images and geometric priors from the frozen 3D reconstruction model. We also address the limitations of the conventional orthographic projection assumption by introducing the PS-Perp dataset, which uses realistic perspective projection to enable learning of spatially varying view directions. Extensive experiments demonstrate that GeoUniPS delivers state-of-the-art performance across multiple datasets, both quantitatively and qualitatively, especially in complex in-the-wild scenes.
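To make the phrase "spatially varying view directions" concrete, the following minimal sketch contrasts the per-pixel view direction under a standard pinhole (perspective) camera with the single constant direction implied by an orthographic camera. It is an illustration of the general geometry only, not the GeoUniPS implementation or the PS-Perp rendering pipeline. The function names, the intrinsics (`fx`, `fy`, `cx`, `cy`), and the sign convention (camera looking along +z) are all assumptions introduced here.

```python
import math


def view_direction_perspective(u, v, fx, fy, cx, cy):
    """Unit vector from the surface point seen at pixel (u, v) toward the camera.

    Assumes a pinhole camera looking along +z. The ray through the pixel is
    K^{-1} [u, v, 1]^T; the view direction is that ray negated and normalized.
    """
    x = (u - cx) / fx
    y = (v - cy) / fy
    norm = math.sqrt(x * x + y * y + 1.0)
    return (-x / norm, -y / norm, -1.0 / norm)


def view_direction_orthographic():
    """Under orthographic projection every pixel shares one view direction."""
    return (0.0, 0.0, -1.0)


def angle_deg(a, b):
    """Angle in degrees between two unit vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


if __name__ == "__main__":
    # Hypothetical intrinsics for a 512 x 512 image.
    fx = fy = 500.0
    cx = cy = 256.0
    ortho = view_direction_orthographic()
    for u, v in [(256, 256), (0, 256), (0, 0)]:
        persp = view_direction_perspective(u, v, fx, fy, cx, cy)
        print((u, v), persp, angle_deg(persp, ortho))
```

With these hypothetical intrinsics the two models agree at the principal point and diverge toward the image border, by roughly 36 degrees at the corner pixel. A shorter focal length (wider field of view) increases the gap, which is the situation in which the orthographic assumption is least accurate.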