🤖 AI Summary
Ultrasound imaging systems typically rely on the DICOM protocol for image transfer, which hinders rapid algorithm validation and prototype development. To address this, we propose a fully self-supervised method that robustly extracts high-fidelity medical ultrasound images from unconstrained smartphone photographs of ultrasound display monitors, without requiring manual annotations. Our approach jointly optimizes geometric distortion correction and illumination normalization by integrating self-supervised image registration with perspective transformation, mitigating the challenges of arbitrary capture conditions (scale variation, viewing angle, and brightness inconsistency). Evaluated on cardiac ultrasound view classification, the corrected images achieve a balanced accuracy of 0.79, substantially outperforming the uncorrected baseline. To our knowledge, this is the first end-to-end, annotation-free framework for extracting clinically meaningful ultrasound content directly from real-world screen captures, enabling agile AI development in clinical ultrasound applications.
📝 Abstract
Ultrasound (US) machines display images on a built-in monitor, but routine transfer to hospital systems relies on DICOM. We propose a self-supervised pipeline that extracts the US image from a photograph of the monitor. This removes the DICOM bottleneck and enables rapid testing and prototyping of new algorithms. In a proof-of-concept study, the rectified images retained enough visual fidelity to classify cardiac views with a balanced accuracy of 0.79 relative to the native DICOM images.
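The perspective-transformation step described above can be illustrated with a plain NumPy homography estimated from the four corners of the monitor in the photograph. This is a minimal sketch, not the paper's actual implementation: the corner coordinates, the target resolution, and the helper names are illustrative assumptions.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate the 3x3 homography mapping src -> dst from four
    point correspondences via the direct linear transform (DLT)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of A, i.e. the last row of V^T.
    _, _, Vt = np.linalg.svd(np.array(A))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def warp_points(H, pts):
    """Apply homography H to an (N, 2) array of points."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    out = pts_h @ H.T
    return out[:, :2] / out[:, 2:3]

# Hypothetical corner detections of the monitor in the phone photo (pixels),
# and the canonical rectangle the screen content is rectified to.
src = np.array([[120.0, 80.0], [980.0, 60.0], [1010.0, 760.0], [90.0, 740.0]])
dst = np.array([[0.0, 0.0], [1024.0, 0.0], [1024.0, 768.0], [0.0, 768.0]])

H = estimate_homography(src, dst)
print(np.round(warp_points(H, src)))  # corners land exactly on the rectangle
```

In a full pipeline, `H` would be passed to an image-warping routine (e.g. OpenCV's `cv2.warpPerspective`) to resample the photograph onto the canonical screen grid before illumination normalization.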