PTB-Image: A Scanned Paper ECG Dataset for Digitization and Image-based Diagnosis

2025-02-19
AI Summary
To address the clinical challenge of automated analysis and digital archiving of paper-based electrocardiogram (ECG) recordings, this work introduces PTB-Image, the first publicly available dataset featuring strictly paired scanned paper-ECG images and their corresponding high-fidelity digital ECG signals. Methodologically, the authors propose VinDigitizer, an end-to-end framework integrating precise signal-line localization, adaptive waveform extraction, geometric distortion correction, and pixel-intensity-to-voltage mapping to achieve holistic image-to-time-series digitization. Evaluated on 549 real-world scanned ECGs, VinDigitizer achieves a mean signal-to-noise ratio (SNR) of 0.01 dB. Under the usual power-ratio definition, an SNR this close to 0 dB means the reconstruction error carries roughly as much power as the signal itself, so the result shows that digitization is feasible while leaving considerable room for improvement, particularly in handling distortions from printing and scanning. This work establishes the first rigorously paired benchmark for paper-based ECGs and provides a baseline digitization methodology, supporting large-scale reuse of legacy ECG data, telecardiology applications, and automated cardiac diagnostics.

πŸ“ Abstract
Electrocardiograms (ECGs) recorded on paper remain prevalent in clinical practice, yet their use presents challenges for automated analysis and digital storage. To address this issue, we introduce PTB-Image, a dataset comprising scanned paper ECGs with corresponding digital signals, enabling research on ECG digitization. We also provide VinDigitizer, a digitization baseline to convert paper-based ECGs into digital time-series signals. The method involves detecting signal rows, extracting waveforms from the background, and reconstructing numerical values from the digitized traces. We applied VinDigitizer to 549 scanned ECGs and evaluated its performance against the original PTB dataset (modified to match the printed signals). The results achieved a mean signal-to-noise ratio (SNR) of 0.01 dB, highlighting both the feasibility and challenges of ECG digitization, particularly in mitigating distortions from printing and scanning processes. By providing PTB-Image and baseline digitization methods, this work aims to facilitate advancements in ECG digitization, enhancing access to historical ECG data and supporting applications in telemedicine and automated cardiac diagnostics.
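
To put the reported 0.01 dB figure in context, here is a minimal sketch of how a per-record SNR could be computed. It assumes the common definition SNR = 10 log10(signal power / error power); the abstract does not spell out the exact formula, and the function name `snr_db` is hypothetical rather than taken from the paper.

```python
import math


def snr_db(reference, reconstructed):
    """SNR in dB of a reconstructed signal against its reference.

    Assumes the common power-ratio definition; the paper's exact
    formula is not given in this summary.
    """
    if len(reference) != len(reconstructed):
        raise ValueError("signals must have the same length")
    # Power of the ground-truth signal.
    signal_power = sum(r * r for r in reference)
    # Power of the reconstruction error.
    noise_power = sum((r - d) ** 2 for r, d in zip(reference, reconstructed))
    if signal_power == 0:
        raise ValueError("reference signal has zero power")
    if noise_power == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_power / noise_power)
```

Under this definition, 0 dB means the error power equals the signal power, and 0.01 dB corresponds to a signal power only about 0.2% larger than the error power.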
Problem

Research questions and friction points this paper is trying to address.

Digitize paper-based ECGs for digital analysis.
Mitigate distortions from printing and scanning.
Enhance access to historical ECG data.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces PTB-Image dataset
Provides VinDigitizer for digitization
Detects and reconstructs ECG signals
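
As an illustration of the final reconstruction step, the sketch below converts an already binarized single-lead strip into one voltage value per image column. This is a generic column-centroid approach, not VinDigitizer's actual algorithm, which this summary does not describe in detail. The function name, the `dpi` default, and the 10 mm/mV gain (a common ECG paper calibration) are all assumptions.

```python
def trace_to_millivolts(binary_image, baseline_row, dpi=300.0, mm_per_mv=10.0):
    """Convert a binarized ECG strip into one voltage (mV) per column.

    binary_image: list of rows, each a list of 0/1 values (1 = trace ink).
    baseline_row: pixel row of the 0 mV baseline (assumed known).
    dpi, mm_per_mv: assumed scan resolution and paper gain.
    """
    pixels_per_mm = dpi / 25.4          # 25.4 mm per inch
    pixels_per_mv = pixels_per_mm * mm_per_mv
    n_cols = len(binary_image[0])
    samples = []
    for col in range(n_cols):
        # Rows in this column that contain trace ink.
        ink_rows = [r for r, row in enumerate(binary_image) if row[col]]
        if not ink_rows:
            samples.append(None)        # gap in the trace; left unfilled
            continue
        center = sum(ink_rows) / len(ink_rows)
        # Image rows grow downward, so voltage is baseline minus row.
        samples.append((baseline_row - center) / pixels_per_mv)
    return samples
```

A real pipeline would also need to remove the grid, separate the leads, interpolate gaps, and resample columns to the target sampling rate; none of those steps are shown here.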
Cuong V. Nguyen
Durham University
machine learning, artificial intelligence, statistics
Hieu X. Nguyen
College of Engineering and Computer Science, VinUniversity, Hanoi, Vietnam
Dung D. Pham Minh
College of Engineering and Computer Science, VinUniversity, Hanoi, Vietnam
Cuong D. Do
College of Engineering and Computer Science, VinUni-Illinois Smart Health Center, VinUniversity, Hanoi, Vietnam