StereoCarla: A High-Fidelity Driving Dataset for Generalizable Stereo

📅 2025-09-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing stereo matching models suffer from limited generalization across diverse camera configurations and complex environments due to insufficient training-data diversity. To address this, we introduce the first high-fidelity synthetic driving dataset explicitly designed for generalizable stereo matching, generated with the CARLA simulator. It comprises stereo image pairs with variable baselines, heterogeneous sensor layouts, and diverse illumination conditions, weather scenarios, and road geometries. The dataset enables rigorous cross-domain evaluation and significantly enhances model adaptability to real-world settings. Models trained on it achieve state-of-the-art generalization performance on established benchmarks, including KITTI, Middlebury, and ETH3D, and further improve absolute accuracy when the dataset is integrated into multi-dataset training pipelines. The core contributions are: (1) the construction of the first high-fidelity, generalization-oriented synthetic stereo vision dataset; and (2) empirical validation of its substantial gains in cross-domain robustness and transferability.

📝 Abstract
Stereo matching plays a crucial role in enabling depth perception for autonomous driving and robotics. While recent years have witnessed remarkable progress in stereo matching algorithms, largely driven by learning-based methods and synthetic datasets, the generalization performance of these models remains constrained by the limited diversity of existing training data. To address these challenges, we present StereoCarla, a high-fidelity synthetic stereo dataset specifically designed for autonomous driving scenarios. Built on the CARLA simulator, StereoCarla incorporates a wide range of camera configurations, including diverse baselines, viewpoints, and sensor placements, as well as varied environmental conditions such as lighting changes, weather effects, and road geometries. We conduct comprehensive cross-domain experiments across four standard evaluation datasets (KITTI2012, KITTI2015, Middlebury, ETH3D) and demonstrate that models trained on StereoCarla outperform those trained on 11 existing stereo datasets in terms of generalization accuracy across multiple benchmarks. Furthermore, when integrated into multi-dataset training, StereoCarla contributes substantial improvements to generalization accuracy, highlighting its compatibility and scalability. This dataset provides a valuable benchmark for developing and evaluating stereo algorithms under realistic, diverse, and controllable settings, facilitating more robust depth perception systems for autonomous vehicles. Code is available at https://github.com/XiandaGuo/OpenStereo, and data is available at https://xiandaguo.net/StereoCarla.
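To see why baseline diversity matters for generalization, recall the standard rectified-stereo relation depth = focal_length × baseline / disparity: the same scene depth maps to very different disparity ranges under different rigs, so a model trained on one fixed baseline sees only a narrow disparity distribution. The sketch below is illustrative only (not code from the paper); the focal length and the non-KITTI baseline value are assumed for the example.

```python
def disparity_to_depth(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth (meters) from disparity (pixels) for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# An object 20 m away, focal length 720 px (assumed), seen by two rigs:
# a KITTI-like 0.54 m baseline vs. a hypothetical narrow 0.10 m baseline.
for baseline_m in (0.54, 0.10):
    disparity_px = 720 * baseline_m / 20.0
    print(f"baseline={baseline_m:.2f} m -> disparity={disparity_px:.2f} px")
```

The wide rig observes the object at about 19.4 px of disparity, the narrow one at only 3.6 px; training across many baselines, as StereoCarla does, exposes a model to this full range rather than a single slice of it.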
Problem

Research questions and friction points this paper is trying to address.

Addresses limited stereo dataset diversity for autonomous driving
Improves generalization of stereo matching algorithms
Provides high-fidelity synthetic data with varied conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

High-fidelity synthetic stereo dataset
Diverse camera configurations and environments
Improves generalization accuracy across benchmarks