Metric Hub: A metric library and practical selection workflow for use-case-driven data quality assessment in medical AI

📅 2026-01-30
🤖 AI Summary
This study addresses the lack of systematic data quality assessment for medical AI, a gap that hinders the development and clinical deployment of trustworthy AI systems. Building on the METRIC framework, the authors introduce the Metric Hub, a curated library of data quality metrics for medical AI, together with a structured metric selection workflow and decision trees that guide context-specific metric choice. Each metric is documented in a metric card, and metrics are selected with a use-case-driven strategy. The approach is demonstrated on the PTB-XL electrocardiogram dataset. The work offers a standardized, actionable way to evaluate whether training and test data are fit-for-purpose in medical AI applications, supporting rigorous and transparent data quality practice in the field.

📝 Abstract
Machine learning (ML) in medicine has transitioned from research to concrete applications that support medical purposes such as therapy selection, monitoring and treatment. Acceptance and effective adoption by clinicians and patients, as well as regulatory approval, require evidence of trustworthiness. A major factor in the development of trustworthy AI is the quantification of data quality for AI model training and testing. We have recently proposed the METRIC-framework for systematically evaluating the suitability (fit-for-purpose) of data for medical ML for a given task. Here, we operationalize this theoretical framework by introducing a collection of data quality metrics, the metric library, for practically measuring data quality dimensions. For each metric, we provide a metric card with the most important information, including definition, applicability, examples, pitfalls and recommendations, to support the understanding and implementation of these metrics. Furthermore, we discuss strategies and provide decision trees for choosing an appropriate set of data quality metrics from the metric library given specific use cases. We demonstrate the impact of our approach on the PTB-XL ECG dataset as an example. This is a first step towards enabling fit-for-purpose evaluation of training and test data in practice as the basis for establishing trustworthy AI in medicine.
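To make the idea of a metric card more concrete, the following is a minimal sketch of how one card might be represented in code. The field names follow the items the abstract lists (definition, applicability, examples, pitfalls, recommendations). The class `MetricCard`, the function `completeness`, and the choice of completeness as the example metric are all assumptions made for illustration and are not taken from the paper or its library.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class MetricCard:
    """Hypothetical record holding the information the abstract lists for a metric card."""
    name: str
    definition: str
    applicability: str
    examples: List[str] = field(default_factory=list)
    pitfalls: List[str] = field(default_factory=list)
    recommendations: List[str] = field(default_factory=list)
    # Optional function that computes the metric on a sequence of values.
    compute: Optional[Callable[[Sequence[Optional[float]]], float]] = None


def completeness(values: Sequence[Optional[float]]) -> float:
    """Return the fraction of entries that are not missing (None)."""
    if len(values) == 0:
        raise ValueError("completeness is undefined for an empty sequence")
    present = sum(1 for v in values if v is not None)
    return present / len(values)


# Illustrative card; the wording is an assumption, not quoted from the paper.
card = MetricCard(
    name="Completeness",
    definition="Fraction of non-missing values in a feature.",
    applicability="Tabular features or metadata where missing values are marked.",
    examples=["Share of ECG records with a recorded patient age."],
    pitfalls=["Placeholder values such as -1 can hide missing entries."],
    recommendations=["Report per feature rather than as one pooled value."],
    compute=completeness,
)

# Toy input: one of four values is missing.
ages = [72.0, None, 65.0, 80.0]
score = card.compute(ages)
print(f"{card.name}: {score:.2f}")  # prints "Completeness: 0.75"
```

Keeping the descriptive fields next to the computation keeps a metric's pitfalls and recommendations visible wherever the metric is applied. A real metric library would likely need richer applicability information to support selection by use case.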
Problem

Research questions and friction points this paper is trying to address.

data quality
medical AI
fit-for-purpose
metric selection
trustworthy AI
Innovation

Methods, ideas, or system contributions that make the work stand out.

metric library
data quality assessment
fit-for-purpose
medical AI
decision tree
Katinka Becker
Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany.
Maximilian P. Oppelt
Department Digital Health and Analytics, Fraunhofer IIS, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany.
Tobias S. Zech
Department Digital Health and Analytics, Fraunhofer IIS, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany.
Martin Seyferth
Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany.
S. Cabon
Univ Rennes, Inserm, LTSI - UMR 1099, F-35000 Rennes, France.
Vanja Miskovic
Nearlab, Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Milano, Italy.
Ivan Cimrák
Department of Software Technologies, Faculty of Management Science and Informatics, University of Žilina, 010 26 Žilina, Slovakia.
Michal Kozubek
Masaryk University
image analysis
Giuseppe D'Avenio
National Centre Artificial Intelligence and Innovative Technologies for Health, Istituto Superiore di Sanità, Rome, Italy.
Ilaria Campioni
National Centre Artificial Intelligence and Innovative Technologies for Health, Istituto Superiore di Sanità, Rome, Italy.
Jana Fehr
QUEST Center for Responsible Research, Berlin Institute of Health (BIH), Charité Universitätsmedizin Berlin, Berlin, Germany.
Kanjar De
RISE Research Institutes of Sweden, Sweden.
Ismail Mahmoudi
CETIC Centre d’excellence en technologies de l’information et de la communication, Belgium.
Emílio Dolgener Cantú
Fraunhofer Institute for Telecommunications Heinrich-Hertz-Institute HHI, Berlin, Germany.
Laurenz Ottmann
Department Digital Health and Analytics, Fraunhofer IIS, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany.
Andreas Klaß
Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany.
Galaad Altares
Multitel Research Center, Mons, Belgium.
Jackie Ma
Fraunhofer HHI
M. Alireza Salehi
RISE Research Institutes of Sweden, Sweden.
Nadine R. Lang-Richter
Department Digital Health and Analytics, Fraunhofer IIS, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany.
Tobias Schaeffter
Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany.
Daniel Schwabe
Professor of Informatics, Pontificia Universidade Católica (PUC), Rio de Janeiro (ret)
Model driven Socio Technical Systems, Knowledge Graphs, Semantic Web, Knowledge Management