End-to-End Driving via Self-Supervised Imitation Learning Using Camera and LiDAR Data

📅 2023-08-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

End-to-end autonomous driving typically relies on large-scale, manually collected control labels and external pre-trained models. To remove that reliance, this paper proposes SSIL, which it presents as the first fully self-supervised framework for E2E driving. SSIL needs neither ground-truth steering angle annotations nor off-the-shelf pretraining and uses only onboard camera and LiDAR data: it generates pseudo steering labels from LiDAR-based ego-pose estimates, which enables self-supervised imitation learning. The method combines multimodal feature fusion, an instruction-conditioned network architecture, and self-supervised regression learning (SSRL). Evaluated on three benchmark datasets, SSIL achieves control accuracy comparable to fully supervised methods, and its pseudo-label generator outperforms a PID-controller-based baseline. The core contribution is an end-to-end driving paradigm that requires neither driving command labels nor a pre-trained model.

📝 Abstract

In autonomous driving, the end-to-end (E2E) approach, which predicts vehicle control signals directly from sensor data, is rapidly gaining attention. Learning a safe E2E driving system requires an extensive amount of driving data and human intervention. Vehicle control data is built from many hours of human driving, so large vehicle control datasets are challenging to construct. Publicly available driving datasets are often collected over a limited range of driving scenes, and vehicle control data can typically be collected only by vehicle manufacturers. To address these challenges, this letter proposes the first fully self-supervised learning framework for E2E driving, self-supervised imitation learning (SSIL), based on the self-supervised regression learning (SSRL) framework. The proposed SSIL framework can learn E2E driving networks without using driving command data or a pre-trained model. To construct pseudo steering angle data, SSIL predicts a pseudo target from the vehicle's poses at the current and previous time points, which are estimated with light detection and ranging (LiDAR) sensors. In addition, we propose two E2E driving networks that predict driving commands depending on a high-level instruction. Our numerical experiments with three different benchmark datasets demonstrate that the proposed SSIL framework achieves E2E driving accuracy very comparable to that of its supervised learning counterpart. The proposed pseudo-label predictor outperformed an existing one that uses a proportional-integral-derivative (PID) controller.
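
The abstract does not spell out how a pseudo steering angle is derived from two consecutive poses. One plausible reading can be sketched with a kinematic bicycle model, which is an assumption here and not necessarily the paper's formulation. The `Pose2D` type, the `wheelbase` default of 2.7 m, and the `min_distance` threshold are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose2D:
    """Planar vehicle pose, e.g. taken from a LiDAR odometry estimate."""
    x: float    # metres
    y: float    # metres
    yaw: float  # radians


def wrap_angle(angle: float) -> float:
    """Wrap an angle to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def pseudo_steering_angle(prev: Pose2D, curr: Pose2D,
                          wheelbase: float = 2.7,
                          min_distance: float = 1e-3) -> float:
    """Estimate a steering angle (radians) from two consecutive poses.

    Assumes the vehicle moved along a circular arc between the poses
    (kinematic bicycle model). This is an illustrative assumption,
    not the method described in the paper.
    """
    dyaw = wrap_angle(curr.yaw - prev.yaw)
    chord = math.hypot(curr.x - prev.x, curr.y - prev.y)
    if chord < min_distance:
        # Vehicle is (nearly) stationary: curvature is ill-defined.
        return 0.0
    # For a circular arc, chord = 2 * R * sin(dyaw / 2), so 1/R follows.
    curvature = 2.0 * math.sin(dyaw / 2.0) / chord
    # Bicycle model: tan(steering) = wheelbase * curvature.
    return math.atan(wheelbase * curvature)


if __name__ == "__main__":
    # Two poses on a left-hand circle of radius 10 m.
    radius, theta = 10.0, 0.1
    p0 = Pose2D(0.0, 0.0, 0.0)
    p1 = Pose2D(radius * math.sin(theta),
                radius * (1.0 - math.cos(theta)), theta)
    print(pseudo_steering_angle(p0, p1))
```

Under these assumptions, the label depends only on the relative motion between the two poses, so no recorded steering command is needed. In practice the pose estimates would be noisy, and some smoothing over time would likely be required.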

Problem

Research questions and friction points this paper is trying to address.

Develops self-supervised imitation learning for autonomous driving without human-labeled data

Creates pseudo steering angles using LiDAR-estimated vehicle poses

Proposes E2E networks for command prediction based on high-level instructions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised imitation learning for E2E driving

Pseudo steering angle prediction using LiDAR data

High-level instruction-based driving command networks
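
The page gives no architectural detail for the instruction-based networks. A common way to condition a driving command on a high-level instruction is a branched output head, where each instruction selects its own regressor on top of shared features. The sketch below illustrates that general idea only. The `Instruction` values, the `BranchedSteeringHead` class, and the linear per-branch regressors are hypothetical and are not taken from the paper.

```python
from enum import Enum
from typing import Dict, List


class Instruction(Enum):
    """Hypothetical set of high-level navigation instructions."""
    FOLLOW_LANE = 0
    TURN_LEFT = 1
    TURN_RIGHT = 2
    GO_STRAIGHT = 3


class BranchedSteeringHead:
    """One linear regressor per instruction over shared features.

    In a real network the features would come from a fused
    camera/LiDAR backbone; here they are a plain list of floats.
    """

    def __init__(self, weights: Dict[Instruction, List[float]],
                 biases: Dict[Instruction, float]) -> None:
        self.weights = weights
        self.biases = biases

    def predict(self, features: List[float],
                instruction: Instruction) -> float:
        """Return the steering output of the branch chosen by the instruction."""
        w = self.weights[instruction]
        if len(w) != len(features):
            raise ValueError("feature size does not match branch weights")
        return sum(wi * fi for wi, fi in zip(w, features)) + self.biases[instruction]


if __name__ == "__main__":
    head = BranchedSteeringHead(
        weights={
            Instruction.TURN_LEFT: [0.5, -0.25],
            Instruction.TURN_RIGHT: [-0.5, 0.25],
        },
        biases={Instruction.TURN_LEFT: 0.1, Instruction.TURN_RIGHT: -0.1},
    )
    features = [1.0, 2.0]
    # The same features yield different commands under different instructions.
    print(head.predict(features, Instruction.TURN_LEFT))
    print(head.predict(features, Instruction.TURN_RIGHT))
```

The point of the branching is that identical sensor input at an intersection can map to different commands, and the instruction resolves that ambiguity.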

🔎 Similar Papers

Mitigating Covariate Shift in Imitation Learning for Autonomous Vehicles Using Latent Space Generative World Models