A Novel Multi-layer Task-centric and Data Quality Framework for Autonomous Driving

πŸ“… 2025-06-19
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Autonomous driving research has long prioritized algorithm development over data quality (DQ), even though quality fluctuations in heterogeneous, multi-source sensor data severely impair system functionality, efficiency, and trustworthiness. To address this, we propose the first task-centric, five-layer DQ framework that dynamically couples DQ assessment into the perception task loop, enabling interpretable mapping between DQ metrics, task requirements, and performance objectives. We identify intra- and cross-modal redundancy issues in image and LiDAR data, and introduce a controllable redundancy reduction mechanism. Empirical evaluation on nuScenes demonstrates that moderate image redundancy removal improves YOLOv8 mAP by 2.3%; we also present the first systematic characterization of image–LiDAR cross-modal redundancy defects. Our methodology integrates DQ quantification, task sensitivity analysis, multi-modal redundancy modeling, and empirical validation. We publicly release all code, datasets, and documentation.
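The paper does not spell out its redundancy reduction mechanism in this summary; a minimal sketch of the general idea, assuming a simple average-hash near-duplicate filter with a tunable distance threshold (all names and parameters here are illustrative, not from the paper):

```python
# Hedged sketch: intra-modal image redundancy reduction via
# average-hash near-duplicate filtering. Hash size and the
# max_dist threshold are illustrative "controllable" knobs,
# not the paper's actual mechanism.

def avg_hash(img, size=8):
    """Downsample a 2D grayscale image (list of lists of numbers)
    to size x size by block averaging, then threshold each block
    at the global mean to produce a bit signature."""
    h, w = len(img), len(img[0])
    bh, bw = h // size, w // size
    blocks = []
    for i in range(size):
        for j in range(size):
            block = [img[y][x]
                     for y in range(i * bh, (i + 1) * bh)
                     for x in range(j * bw, (j + 1) * bw)]
            blocks.append(sum(block) / len(block))
    mean = sum(blocks) / len(blocks)
    return tuple(b > mean for b in blocks)

def hamming(a, b):
    """Number of differing bits between two hash signatures."""
    return sum(x != y for x, y in zip(a, b))

def drop_redundant(frames, max_dist=5):
    """Keep a frame only if its hash differs from every already-kept
    frame by more than max_dist bits; raising max_dist removes
    redundancy more aggressively."""
    kept, hashes = [], []
    for f in frames:
        h = avg_hash(f)
        if all(hamming(h, k) > max_dist for k in hashes):
            kept.append(f)
            hashes.append(h)
    return kept
```

Feeding only the retained frames to the detector is one plausible way a "moderate" level of redundancy removal could be realized in practice.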

πŸ“ Abstract
The next-generation autonomous vehicles (AVs), embedded with frequent real-time decision-making, will rely heavily on a large volume of multisource and multimodal data. In real-world settings, the data quality (DQ) of different sources and modalities usually varies due to unexpected environmental factors or sensor issues. However, both researchers and practitioners in the AV field overwhelmingly concentrate on models/algorithms while undervaluing the DQ. To fulfill the needs of the next-generation AVs with guarantees of functionality, efficiency, and trustworthiness, this paper proposes a novel task-centric data quality framework that consists of five layers: data layer, DQ layer, task layer, application layer, and goal layer. The proposed framework aims to map DQ with task requirements and performance goals. To illustrate, a case study investigating redundancy on the nuScenes dataset shows that partially removing redundancy in multisource image data can improve YOLOv8 object detection performance. Analysis of multimodal image and LiDAR data further reveals existing redundancy-related DQ issues. This paper opens up a range of critical but unexplored challenges at the intersection of DQ, task orchestration, and performance-oriented system development in AVs. It is expected to guide the AV community toward building more adaptive, explainable, and resilient AVs that respond intelligently to dynamic environments and heterogeneous data streams. Code, data, and implementation details are publicly available at: https://anonymous.4open.science/r/dq4av-framework/README.md.
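The five-layer structure described above can be pictured as a chain that checks whether observed DQ metrics satisfy a task's requirements before the task feeds the application and goal layers. A toy encoding of that DQ-to-task mapping, assuming made-up metric names and thresholds (none of these values come from the paper):

```python
# Hedged sketch: a toy DQ layer -> task layer mapping. Metric names
# ("redundancy", "completeness") and thresholds are hypothetical
# placeholders for whatever the framework's DQ layer measures.

from dataclasses import dataclass


@dataclass
class DQMetric:
    name: str      # e.g. "redundancy", "completeness"
    value: float   # normalized score in [0, 1]


@dataclass
class TaskRequirement:
    task: str          # e.g. "object_detection"
    min_scores: dict   # metric name -> minimum acceptable score


def unmet_requirements(metrics, req):
    """Return the names of DQ metrics that fall below the task's
    thresholds -- an interpretable DQ -> task-requirement check."""
    scores = {m.name: m.value for m in metrics}
    return [name for name, lo in req.min_scores.items()
            if scores.get(name, 0.0) < lo]
```

A scheduler in the task layer could use such a check to decide, per frame batch, whether to trigger DQ remediation (e.g. redundancy reduction) before running the downstream task.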
Problem

Research questions and friction points this paper is trying to address.

Addresses varying data quality in autonomous vehicle multisource data
Proposes task-centric framework linking data quality to performance goals
Investigates redundancy impact on object detection in multimodal datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-layer framework for AV data quality
Task-centric mapping of DQ to performance
Redundancy removal improves detection accuracy
πŸ”Ž Similar Papers