A Framework for Transparent Reporting of Data Quality Analysis Across the Clinical Electronic Health Record Data Lifecycle

📅 2026-02-28
🤖 AI Summary
This study addresses opaque data quality (DQ) reporting in the secondary use of electronic health records (EHRs), which the authors link to delayed adoption of clinical AI models and reduced clinician trust. They propose a framework for transparent DQ reporting across the clinical EHR data lifecycle. The framework distinguishes data-generating organizations from data-receiving organizations, defines five key lifecycle phases and multiple actors, and lets users map DQ parameters to lifecycle stages. It was developed through iterative analysis of the actors and phases of the clinical data lifecycle and, when applied to a real-world dataset, showed where DQ issues may originate. The authors present it as a structured way to report DQ assessments that can make fitness for reuse more transparent, supporting clinical research, AI model development, and internal organizational governance.

📝 Abstract
Data quality (DQ) and transparency of secondary data are critical factors that delay the adoption of clinical AI models and affect clinician trust in them. Many DQ studies fail to clarify where along the data lifecycle quality checks occur, leading to uncertainty about provenance and fitness for reuse. This study develops a framework for transparent reporting of DQ assessments across the clinical electronic health record (EHR) data lifecycle. The reporting framework was developed through iterative analysis to identify the actors and phases of the clinical data lifecycle. It distinguishes between data-generating organizations and data-receiving organizations so that users can map DQ parameters to stages across the data lifecycle, and it defines five key lifecycle phases and multiple actors. When applied to a real-world dataset, the framework demonstrated applicability in revealing where DQ issues may originate. The framework provides a structured approach for reporting DQ assessments, which can enhance transparency regarding data fitness for reuse, supporting reliable clinical research, AI model development, and internal organizational governance. This work provides practical guidance for researchers to understand data provenance and for organizations to target DQ improvement efforts across the data lifecycle.
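The idea of tagging each DQ check with a lifecycle phase and an organization type could be recorded roughly as follows. This is a minimal sketch under stated assumptions, not the authors' reporting format: the abstract does not name the five phases, so they are represented only by the numbers 1 to 5, and `DQCheck`, `OrgRole`, `failures_by_phase`, and the dimension labels are hypothetical names chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class OrgRole(Enum):
    """The two organization types the abstract distinguishes."""
    GENERATING = "data-generating organization"
    RECEIVING = "data-receiving organization"


@dataclass(frozen=True)
class DQCheck:
    """One reported DQ check (hypothetical record layout)."""
    phase: int       # lifecycle phase 1..5; phase names are not given in the abstract
    role: OrgRole    # which kind of organization ran the check
    dimension: str   # illustrative DQ dimension label, e.g. "completeness"
    passed: bool

    def __post_init__(self):
        # The abstract states there are five phases, so reject anything else.
        if not 1 <= self.phase <= 5:
            raise ValueError("phase must be between 1 and 5")


def failures_by_phase(checks):
    """Count failed checks per (phase, role) pair."""
    return Counter((c.phase, c.role) for c in checks if not c.passed)


# Made-up example records, for illustration only.
checks = [
    DQCheck(1, OrgRole.GENERATING, "completeness", passed=False),
    DQCheck(1, OrgRole.GENERATING, "plausibility", passed=True),
    DQCheck(4, OrgRole.RECEIVING, "completeness", passed=False),
    DQCheck(4, OrgRole.RECEIVING, "conformance", passed=False),
]
summary = failures_by_phase(checks)
```

Grouping failures by phase and organization type is one simple way to see where along the lifecycle an issue first appears, which is the kind of traceability the abstract describes.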
Problem

Research questions and friction points this paper is trying to address.

data quality
electronic health record
data lifecycle
transparency
clinical AI
Innovation

Methods, ideas, or system contributions that make the work stand out.

data quality
electronic health records
data lifecycle
transparency
clinical AI