RobuSTereo: Robust Zero-Shot Stereo Matching under Adverse Weather

📅 2025-07-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing learning-based stereo matching models exhibit poor zero-shot generalization under adverse weather conditions, primarily due to the scarcity of real-world degraded stereo data and insufficient feature discriminability. To address this, we propose a robust stereo matching framework for zero-shot domain adaptation. First, we introduce a diffusion-model-driven stereo-consistent weather simulation framework that synthesizes physically plausible, structurally aligned stereo image pairs under rain, fog, and snow. Second, we design a hybrid ConvNet-Transformer robust encoder that jointly leverages local detail modeling and global denoising capabilities, enhancing invariance to noise, low contrast, and blur. Experiments demonstrate substantial improvements in disparity estimation accuracy across unseen adverse weather conditions. Our method achieves state-of-the-art robustness in depth estimation, outperforming existing approaches on multiple benchmarks. This work provides a reliable zero-shot stereo perception solution for safety-critical applications such as autonomous driving.

📝 Abstract
Learning-based stereo matching models struggle in adverse weather conditions due to the scarcity of corresponding training data and the challenges in extracting discriminative features from degraded images. These limitations significantly hinder zero-shot generalization to out-of-distribution weather conditions. In this paper, we propose **RobuSTereo**, a novel framework that enhances the zero-shot generalization of stereo matching models under adverse weather by addressing both data scarcity and feature extraction challenges. First, we introduce a diffusion-based simulation pipeline with a stereo consistency module, which generates high-quality stereo data tailored for adverse conditions. By training stereo matching models on our synthetic datasets, we reduce the domain gap between clean and degraded images, significantly improving the models' robustness to unseen weather conditions. The stereo consistency module ensures structural alignment across synthesized image pairs, preserving geometric integrity and enhancing depth estimation accuracy. Second, we design a robust feature encoder that combines a specialized ConvNet with a denoising transformer to extract stable and reliable features from degraded images. The ConvNet captures fine-grained local structures, while the denoising transformer refines global representations, effectively mitigating the impact of noise, low visibility, and weather-induced distortions. This enables more accurate disparity estimation even under challenging visual conditions. Extensive experiments demonstrate that **RobuSTereo** significantly improves the robustness and generalization of stereo matching models across diverse adverse weather scenarios.
Problem

Research questions and friction points this paper is trying to address.

Enhancing zero-shot stereo matching in adverse weather
Overcoming data scarcity for weather-degraded stereo images
Improving feature extraction from noisy, low-visibility conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion-based simulation pipeline with stereo consistency
Specialized ConvNet and denoising transformer encoder
Synthetic datasets for adverse weather robustness
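The hybrid encoder idea above (a ConvNet branch for fine-grained local structure plus a transformer branch for globally refined, denoised representations) can be illustrated with a toy sketch. This is not the paper's actual architecture: all shapes, the identity Q/K/V projections, and the concatenation-based fusion are simplifying assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv3x3(x, w):
    """Valid 3x3 convolution over a single-channel image (local detail branch)."""
    H, W = x.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            out[i, j] = np.sum(x[i:i + 3, j:j + 3] * w)
    return out

def self_attention(tokens):
    """Single-head self-attention over tokens (global refinement branch).

    Uses identity Q/K/V projections for simplicity; a real encoder would
    learn these projections.
    """
    d = tokens.shape[1]
    scores = tokens @ tokens.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # row-wise softmax
    return attn @ tokens

# Toy "degraded" input image.
img = rng.standard_normal((8, 8))

local_feat = conv3x3(img, rng.standard_normal((3, 3)))  # (6, 6) local features
tokens = img.reshape(16, 4)                             # 16 tokens of dim 4
global_feat = self_attention(tokens)                    # (16, 4) refined tokens

# Fuse the two branches into a single feature descriptor.
feature = np.concatenate([local_feat.ravel(), global_feat.ravel()])
print(feature.shape)
```

The printed shape is `(100,)`: 36 local values plus 64 globally attended values. In the paper the two branches are fused inside a learned encoder rather than by flat concatenation; the sketch only shows why combining a local receptive field with global attention can stabilize features when noise corrupts individual pixels.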
Yuran Wang
Beijing Institute of Technology
Yingping Liang
Ph.D. student, Beijing Institute of Technology (2022–2028)
3D Vision
Yutao Hu
School of Computer Science and Engineering, Southeast University
Ying Fu
Beijing Institute of Technology