🤖 AI Summary
This work addresses the limited accuracy of single-shot 3D reconstruction for complex specular objects, particularly in dynamic scenes and in regions with high curvature or high-frequency geometry. We propose a physics-guided deep learning approach that combines polarization imaging with structured-light illumination. The method uses a dual-encoder architecture to jointly interpret polarization and geometric cues, with a feature cross-modulation mechanism that resolves their nonlinear coupling and thereby avoids the accuracy limit imposed by the conventional orthographic imaging assumption. By embedding physical priors into an active polarization-based 3D imaging pipeline, the framework delivers accurate and robust surface normal estimation from a single shot with fast inference, making 3D imaging of intricate specular shapes practical.
📝 Abstract
3D imaging of specular surfaces remains challenging in real-world scenarios such as in-line inspection or hand-held scanning, which require fast and accurate measurement of complex geometries. Optical metrology techniques such as deflectometry achieve high accuracy but typically rely on multi-shot acquisition, making them unsuitable for dynamic environments. Fourier-based single-shot approaches alleviate this constraint, yet their performance deteriorates on surfaces with high-spatial-frequency structure or large curvature. Alternatively, polarimetric 3D imaging, as developed in computer vision, operates in a single shot and is robust to geometric complexity. However, its accuracy is fundamentally limited by the orthographic imaging assumption. In this paper, we propose a physics-informed deep learning framework for single-shot 3D imaging of complex specular surfaces. Polarization cues provide orientation priors that assist in interpreting the geometric information encoded by structured illumination. These complementary cues are processed by a dual-encoder architecture with mutual feature modulation, allowing the network to resolve their nonlinear coupling and directly infer surface normals. The proposed method achieves accurate and robust normal estimation from a single shot with fast inference, enabling practical 3D imaging of complex specular surfaces.
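To make the notion of polarization cues concrete, the sketch below computes the angle and degree of linear polarization (AoLP and DoLP) from the linear Stokes parameters. The abstract does not say how the cues are extracted, so this is only an illustration under an assumption: a sensor that records intensities behind linear polarizers at 0°, 45°, 90°, and 135°. The function name `polarization_cues` and its parameters are hypothetical and do not come from the paper.

```python
import numpy as np


def polarization_cues(i0, i45, i90, i135, eps=1e-8):
    """Return (AoLP, DoLP) from four polarizer-angle intensity images.

    Assumption (not stated in the abstract): intensities are captured
    behind linear polarizers oriented at 0, 45, 90 and 135 degrees.
    Inputs may be scalars or arrays of identical shape.
    """
    i0, i45, i90, i135 = (
        np.asarray(x, dtype=np.float64) for x in (i0, i45, i90, i135)
    )

    # Linear Stokes parameters.
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical
    s2 = i45 - i135                     # +45 vs. -45 degrees

    # Angle of linear polarization in radians, defined modulo pi.
    aolp = 0.5 * np.arctan2(s2, s1)

    # Degree of linear polarization; eps avoids division by zero
    # in dark pixels, and clipping guards against noise.
    dolp = np.clip(np.sqrt(s1**2 + s2**2) / (s0 + eps), 0.0, 1.0)
    return aolp, dolp
```

Because AoLP is only defined modulo π, it constrains the azimuth of the surface normal up to an ambiguity. This is one reason such cues are better treated as orientation priors to be combined with the structured-illumination signal than as direct normal measurements.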