🤖 AI Summary
Traditional active stereo vision relies on physical structured-light projectors, which limits its working range and its robustness under varying ambient illumination. To address this, we propose the virtual active stereo paradigm: the physical projector is replaced by a depth sensor, whose sparse geometric priors are used to render geometrically consistent virtual textures directly onto the stereo image pair in real time, without modifying stereo matching algorithms or retraining models. Our core contribution is a zero-shot, learning-free cross-modal alignment mechanism that enables geometry-driven rendering of virtual patterns from depth to images. This forms a plug-and-play, sensor-agnostic fusion framework compatible with any depth sensor. Evaluated on indoor and outdoor datasets covering both long and close range, our method significantly improves the accuracy and robustness of both conventional and deep-learning-based stereo algorithms and achieves state-of-the-art performance on active stereo benchmarks, even when using raw, unfiltered depth input.
📝 Abstract
This paper presents a novel general-purpose stereo and depth data fusion paradigm that mimics the active stereo principle by replacing the unreliable physical pattern projector with a depth sensor. Using the sparse hints obtained from the depth sensor, it projects virtual patterns consistent with the scene geometry onto the left and right images acquired by a conventional stereo camera, which facilitates visual correspondence. By design, any depth sensing device can be seamlessly plugged into our framework, enabling the deployment of a virtual active stereo setup in any environment and overcoming the severe limitations of physical pattern projection, such as its limited working range and its sensitivity to environmental conditions. Exhaustive experiments on indoor and outdoor datasets featuring both long and close range, including those providing raw, unfiltered depth hints from off-the-shelf depth sensors, highlight the effectiveness of our approach in notably boosting the robustness and accuracy of both conventional stereo algorithms and deep stereo networks without any code modification and even without re-training. Additionally, we assess the performance of our strategy on active stereo evaluation datasets with conventional pattern projection. In all these scenarios, our virtual pattern projection paradigm achieves state-of-the-art performance. The source code is available at: https://github.com/bartn8/vppstereo.
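As a rough illustration of the idea described in the abstract, the following Python sketch paints the same random intensity at corresponding pixels of a rectified stereo pair, using sparse disparities derived from depth hints. It is not the authors' implementation (the linked repository is the reference for that). The function names `depth_to_disparity` and `project_virtual_pattern`, the parameters `focal_px` and `baseline_m`, the single-pixel random pattern, and the rounding of disparities to integers are all assumptions made for illustration. The sketch also assumes the depth hints are already registered to the left camera and that zero marks a missing hint.

```python
import numpy as np


def depth_to_disparity(depth, focal_px, baseline_m):
    """Convert sparse metric depth (0 = no hint) to disparity in pixels.

    Uses the standard rectified-stereo relation d = f * B / z.
    """
    disparity = np.zeros_like(depth, dtype=np.float64)
    valid = depth > 0
    disparity[valid] = focal_px * baseline_m / depth[valid]
    return disparity


def project_virtual_pattern(left, right, disparity, seed=0):
    """Paint matching random intensities on both views at hinted pixels.

    left, right: HxW grayscale float images from a rectified stereo pair.
    disparity:   HxW sparse disparity hints aligned with the left image.
    Returns patched copies; the inputs are left untouched.
    """
    rng = np.random.default_rng(seed)
    left_out = left.copy()
    right_out = right.copy()

    # Pixels of the left image that carry a hint.
    ys, xs = np.nonzero(disparity > 0)
    # Hypothetical simplification: snap each disparity to the nearest pixel.
    d = np.rint(disparity[ys, xs]).astype(int)
    xr = xs - d  # matching column in the right image

    # Drop hints whose match falls outside the right image.
    inside = xr >= 0
    ys, xs, xr = ys[inside], xs[inside], xr[inside]

    # One random intensity per hint, written identically to both views,
    # so the two pixels become a distinctive, consistent match.
    pattern = rng.uniform(0.0, 1.0, size=ys.shape[0])
    left_out[ys, xs] = pattern
    right_out[ys, xr] = pattern
    return left_out, right_out
```

The patched pair can then be passed to an unmodified stereo matcher, which is the sense in which the abstract speaks of boosting accuracy without code changes or re-training. A practical version would likely need to handle sub-pixel disparities, occlusions, and larger pattern patches, none of which this sketch attempts.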