SafeAug: Safety-Critical Driving Data Augmentation from Naturalistic Datasets

📅 2025-01-03
🤖 AI Summary
To address the insufficient safety validation of autonomous driving models caused by the scarcity of hazardous scenarios in real-world driving data, this paper proposes a controllable method for augmenting safety-critical scenarios from naturalistic data. The approach is presented as the first to enable accurate modeling and controllable synthesis of typical hazardous scenarios, such as emergency close approaches and high-risk lane changes, while preserving image realism. It does so by combining YOLOv5-based object detection, monocular depth estimation, and rigid-body 3D spatial transformations constrained by vehicle dynamics. Augmented data generated from the KITTI dataset improves downstream model performance, raising hazardous-scenario recognition accuracy by 12.7% over baselines that include SMOGN and importance sampling. The framework offers a low-cost way to generate high-fidelity safety-critical training data.

📝 Abstract
Safety-critical driving data is crucial for developing safe and trustworthy self-driving algorithms. Because such data is scarce in naturalistic datasets, current approaches rely primarily on simulated or artificially generated images. However, a gap in authenticity remains between these generated images and naturalistic ones. To address this issue, we propose a novel framework that augments safety-critical driving data from a naturalistic dataset. In this framework, we first detect vehicles using YOLOv5, then apply depth estimation and 3D transformation to better simulate vehicle proximity and critical driving scenarios. This allows for targeted modification of vehicle dynamics data to reflect potentially hazardous situations. Compared with simulated or artificially generated data, our augmentation methods generate safety-critical driving data with minimal compromise on image authenticity. Experiments on the KITTI dataset demonstrate that a downstream self-driving algorithm trained on the augmented dataset outperforms the baselines, which include SMOGN and importance sampling.
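The geometric idea behind the proximity step can be illustrated with a small sketch. This is not the authors' implementation, which is not shown on this page. It assumes a simple pinhole camera with a known principal point, treats the detected vehicle as a flat patch at a single estimated depth, and moves it closer along the optical axis. The names `Box`, `move_box_closer`, and `time_to_collision` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical helper)."""
    x1: float
    y1: float
    x2: float
    y2: float


def move_box_closer(box: Box, depth_m: float, new_depth_m: float,
                    cx: float, cy: float) -> Box:
    """Re-project a detected box as if the vehicle moved from depth_m to new_depth_m.

    Assumption: pinhole camera, so a pixel offset from the principal point
    (cx, cy) scales with depth_m / new_depth_m when the lateral and vertical
    position of the object is held fixed.
    """
    if depth_m <= 0 or new_depth_m <= 0:
        raise ValueError("depths must be positive")
    s = depth_m / new_depth_m  # s > 1 means the vehicle appears larger (closer)
    return Box(
        cx + (box.x1 - cx) * s,
        cy + (box.y1 - cy) * s,
        cx + (box.x2 - cx) * s,
        cy + (box.y2 - cy) * s,
    )


def time_to_collision(gap_m: float, closing_speed_mps: float) -> float:
    """Time to collision for a constant closing speed; infinite if not closing."""
    if closing_speed_mps <= 0:
        return float("inf")
    return gap_m / closing_speed_mps


# Example: a lead vehicle detected at 20 m is re-projected at 10 m.
original = Box(600.0, 170.0, 680.0, 230.0)
closer = move_box_closer(original, depth_m=20.0, new_depth_m=10.0, cx=620.0, cy=190.0)
print(closer)
print(time_to_collision(gap_m=10.0, closing_speed_mps=5.0))
```

Halving the depth doubles the box's width and height in the image, and the shorter gap gives a smaller time to collision for the same closing speed. This is the kind of dynamics label that would need to be updated alongside the image in any augmentation of this type.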
Problem

Research questions and friction points this paper is trying to address.

Autonomous vehicle safety
Driving data extraction
Hazardous driving situations generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

SafeAug
YOLOv5
Adversarial Driving Scenarios
Zhaobin Mo
Columbia University
Physics-informed Deep Learning, Generative Adversarial Networks, Reinforcement Learning
Yunlong Li
Department of Electrical Engineering, Columbia University, 500 West 120th Street, New York, NY 10025, USA
Xuan Di
Department of Civil Engineering and Engineering Mechanics, Columbia University, 500 West 120th Street, New York, NY 10025, USA; Data Science Institute, Columbia University, 550 W 120th St, New York, NY 10027, USA