H-OmniStereo: Zero-Shot Omnidirectional Stereo Matching with Heading-Aligned Normal Priors

📅 2026-05-14

🤖 AI Summary
This work addresses two obstacles to adapting perspective stereo architectures to panoramic stereo matching: the scarcity of omnidirectional stereo data and the degradation of perspective monocular priors under spherical distortion. The authors construct a large-scale synthetic dataset of top-bottom equirectangular stereo pairs and propose a monocular normal estimator that operates in a heading-aligned coordinate system. The resulting normal priors are robust to distortion and consistent across views, and the design also accommodates mismatches between training and testing fields of view. Experiments show that the model outperforms existing approaches on out-of-domain datasets in a zero-shot setting and generalizes to real-world consumer-grade panoramic camera setups with a single model. The authors state that the model and dataset will be publicly released.
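The listing does not define "heading-aligned" precisely. As an illustration only, the sketch below assumes one plausible reading: each pixel's surface normal is rotated about the vertical axis by the pixel's own longitude, so the local forward axis points along that pixel's heading. The function names, the z-up axis convention, and the longitude range are hypothetical choices, not taken from the paper.

```python
import math

def col_to_longitude(u, width):
    """Map an equirectangular pixel column to a longitude in [-pi, pi).

    Assumed convention (not from the paper): column 0 is at -pi.
    """
    return (u + 0.5) / width * 2.0 * math.pi - math.pi

def to_heading_aligned(normal, longitude):
    """Rotate a camera-frame normal about the vertical (z) axis by -longitude.

    Assumed convention: z is up and the heading at a given longitude is
    (cos(longitude), sin(longitude), 0).
    """
    nx, ny, nz = normal
    c, s = math.cos(longitude), math.sin(longitude)
    return (c * nx + s * ny, -s * nx + c * ny, nz)

# A cylindrical wall centred on the camera has an inward normal that changes
# with longitude in the camera frame but is constant in the rotated frame.
lon = col_to_longitude(100, 1024)
inward = (-math.cos(lon), -math.sin(lon), 0.0)
aligned = to_heading_aligned(inward, lon)
```

Under this assumed convention the representation depends only on how a surface is oriented relative to the viewing heading, so shifting or cropping the panorama horizontally leaves it unchanged. That is consistent with the claimed tolerance to field-of-view mismatch, though the authors' actual formulation may differ.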
📝 Abstract
Stereo matching on top-bottom equirectangular images provides an effective framework for full-surround perception, as vertically aligned epipolar lines enable the use of advanced perspective stereo architectures that are largely driven by large-scale datasets and monocular priors. However, the performance of such adaptations is severely limited by the scarcity of omnidirectional stereo datasets and the degradation of perspective monocular priors under spherical distortions. To address these challenges, we propose H-OmniStereo, a zero-shot omnidirectional stereo matching framework. First, we construct a high-quality synthetic dataset comprising over 2.8 million top-bottom equirectangular stereo pairs to scale up training. Second, we introduce an equirectangular monocular normal estimator, specifically operating in a heading-aligned coordinate system. Beyond providing distortion-robust and cross-view-consistent geometric priors for establishing reliable correspondences in stereo matching, this design boosts training efficiency and accommodates train-test FoV mismatches. Extensive experiments show that our approach achieves higher accuracy than existing methods on out-of-domain datasets and successfully generalizes to real-world consumer camera setups using a single model. Both the model and the dataset will be open-sourced.
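To make the top-bottom geometry concrete, the following minimal sketch triangulates range from the vertical angular disparity between two vertically stacked spherical cameras. It uses textbook law-of-sines geometry rather than anything specific to H-OmniStereo, and the function names, row-to-latitude convention, and baseline parameter are assumptions made for illustration.

```python
import math

def row_to_latitude(v, height):
    """Map an equirectangular pixel row (0 = top) to latitude in radians.

    Assumed convention: the image spans latitudes +pi/2 (top) to -pi/2.
    """
    return math.pi / 2.0 - (v + 0.5) / height * math.pi

def ranges_from_latitudes(lat_bottom, lat_top, baseline):
    """Triangulate the range to one point from both cameras.

    Assumes the top camera sits `baseline` directly above the bottom camera,
    so a point appears at a lower latitude in the top view and the angular
    disparity lat_bottom - lat_top is positive.
    """
    disparity = lat_bottom - lat_top
    if disparity <= 0.0:
        raise ValueError("angular disparity must be positive")
    # Law of sines in the triangle (bottom camera, top camera, point).
    r_bottom = baseline * math.cos(lat_top) / math.sin(disparity)
    r_top = baseline * math.cos(lat_bottom) / math.sin(disparity)
    return r_bottom, r_top

# Point 1 m out and 1 m up from the bottom camera, with a 1 m baseline.
r_bottom, r_top = ranges_from_latitudes(math.atan2(1.0, 1.0), 0.0, 1.0)
```

Because the baseline is vertical, a point keeps the same longitude in both views and only its latitude shifts. Correspondences therefore lie along image columns, which is what the abstract means by vertically aligned epipolar lines.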
Problem

Research questions and friction points this paper is trying to address.

omnidirectional stereo matching
equirectangular images
monocular priors
spherical distortion
stereo correspondence
Innovation

Methods, ideas, or system contributions that make the work stand out.

zero-shot
omnidirectional stereo matching
heading-aligned normals
equirectangular images
synthetic dataset
Chenxing Jiang
Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
Zhe Tong
Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
Pusen Gao
Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
Peize Liu
Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
Yang Xu
Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
Chuan Fang
HKUST, Electronic and Computer Engineering Department
Robotics, Computer vision, Visual reconstruction
Ping Tan
Hong Kong University of Science and Technology (HKUST)
Computer Vision, Computer Graphics
Shaojie Shen
Associate Professor, Hong Kong University of Science and Technology
Robotics