DataPro - A Standardized Data Understanding and Processing Procedure: A Case Study of an Eco-Driving Project

📅 2025-01-21
🏛️ EI.A
📈 Citations: 0
Influential: 0
🤖 AI Summary
Multi-source vehicular data for eco-driving suffers from semantic ambiguity, missing annotations, and heterogeneous quality, all of which bias downstream models. To address this, the paper proposes a standardized data engineering framework with embedded domain knowledge. It establishes a hierarchical data maturity assessment system and a reusable four-stage pipeline (understanding, cleaning, augmentation, and alignment) that integrates rule-based cleaning, lightweight active learning for annotation, spatiotemporal consistency verification, and data augmentation guided by a driving-behavior graph. The key contribution is the first explicit standardization of the data engineering process and its tight coupling with traffic-domain constraints. Evaluated on real-world fleet data, the framework improves the prediction accuracy of downstream energy-saving strategy models by 12.7% and reduces data preparation time by 64%. The pipeline has been reused across three traffic AI projects.
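As a rough illustration of how staged processing of this kind might be composed, the sketch below chains a rule-based cleaning step and a spatiotemporal consistency check. It is not the paper's implementation: the `Sample` fields, the thresholds, and every function name are hypothetical, and the understanding, augmentation, and alignment stages are not shown.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    t: float          # timestamp in seconds
    speed_mps: float  # vehicle speed in metres per second
    dist_m: float     # cumulative odometer distance in metres


# Hypothetical thresholds, chosen for illustration only.
MAX_SPEED_MPS = 70.0
DIST_TOLERANCE_M = 5.0

Stage = Callable[[List[Sample]], List[Sample]]


def rule_based_clean(samples: List[Sample]) -> List[Sample]:
    """Drop samples with physically implausible speeds, then sort by time."""
    kept = [s for s in samples if 0.0 <= s.speed_mps <= MAX_SPEED_MPS]
    return sorted(kept, key=lambda s: s.t)


def is_consistent(prev: Sample, cur: Sample) -> bool:
    """Check that distance travelled roughly equals mean speed x elapsed time."""
    dt = cur.t - prev.t
    if dt <= 0:
        return False
    expected = 0.5 * (prev.speed_mps + cur.speed_mps) * dt
    return abs((cur.dist_m - prev.dist_m) - expected) <= DIST_TOLERANCE_M


def verify_spatiotemporal(samples: List[Sample]) -> List[Sample]:
    """Keep the first sample, then each sample consistent with the last kept one."""
    kept: List[Sample] = []
    for s in samples:
        if not kept or is_consistent(kept[-1], s):
            kept.append(s)
    return kept


def run_pipeline(samples: List[Sample], stages: List[Stage]) -> List[Sample]:
    """Apply each stage in order, feeding its output to the next stage."""
    for stage in stages:
        samples = stage(samples)
    return samples


raw = [
    Sample(0.0, 10.0, 0.0),
    Sample(1.0, 10.0, 10.0),
    Sample(2.0, 200.0, 20.0),   # implausible speed, removed by cleaning
    Sample(3.0, 10.0, 500.0),   # odometer jump, removed by verification
    Sample(4.0, 10.0, 40.0),
]
cleaned = run_pipeline(raw, [rule_based_clean, verify_spatiotemporal])
```

Representing each stage as a plain function from a list of samples to a list of samples keeps the stages independently testable and lets a project reorder or swap them, which is one plausible way to obtain the reusability the summary describes.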

Problem

Research questions and friction points this paper is trying to address.

Data Handling
Interdepartmental Collaboration
Business Decision Support

Innovation

Methods, ideas, or system contributions that make the work stand out.

DataPro
CRISP-DM improvement
Technical Understanding and Implementation