COVID-BLUeS -- A Prospective Study on the Value of AI in Lung Ultrasound Analysis

📅 2025-09-09
🤖 AI Summary
Problem: Existing lung ultrasound (LUS) AI models generalize poorly because of low-quality training data, frame-level modeling that disregards video-level information, and limited integration of clinical context. Method: The authors prospectively collected LUS videos at six body locations, following the BLUE protocol, together with complete blood count (CBC) data from 63 suspected COVID-19 patients (33 positive) at Maastricht University Medical Centre, and evaluated several image-based and multimodal AI models, both zero-shot and with targeted training. Contribution/Results: Image-based models identify COVID-19 with 65% accuracy zero-shot and up to 79% with targeted training, compared with at most 65% based on human severity annotations; multimodal models combining images and CBC improve significantly over image-only models. The study supports the value of AI-assisted LUS analysis for infection detection and severity assessment while showing that the evaluated models fall short of the performance expected from previous work, and it releases the dataset publicly to aid future research.

📝 Abstract
As a lightweight and non-invasive imaging technique, lung ultrasound (LUS) has gained importance for assessing lung pathologies. Because interpreting LUS is time- and expertise-intensive, the use of artificial intelligence (AI) in medical decision support systems is promising. However, due to the poor quality of existing data used for training AI models, their usability for real-world applications remains unclear. In a prospective study, we analyze data from 63 COVID-19 suspects (33 positive) collected at Maastricht University Medical Centre. Ultrasound recordings at six body locations were acquired following the BLUE protocol and manually labeled for severity of lung involvement. Several AI models were applied and trained to detect pulmonary infection and assess its severity. The severity of the lung infection, as assigned by human annotators based on the LUS videos, is not significantly different between COVID-19 positive and negative patients (p = 0.89). Nevertheless, the predictions of image-based AI models identify a COVID-19 infection with 65% accuracy when applied zero-shot (i.e., trained on other datasets), and up to 79% with targeted training, whereas the accuracy based on human annotations is at most 65%. Multi-modal models combining images and complete blood count (CBC) data improve significantly over image-only models. Although our analysis generally supports the value of AI in LUS assessment, the evaluated models fall short of the performance expected from previous work. We find this is due to 1) the heterogeneity of LUS datasets, which limits generalization to new data, 2) the frame-based processing of AI models, which ignores video-level information, and 3) the lack of work on multi-modal models that can extract the most relevant information from video-, image- and variable-based inputs. To aid future research, we publish the dataset at: https://github.com/NinaWie/COVID-BLUES.
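The abstract names frame-based processing as one limitation of current models. A common workaround, shown here only as a minimal sketch and not as the authors' method, is to pool per-frame probabilities into one score per video and then into one score per patient. The function name `aggregate_patient`, the location keys, and the 0.5 threshold are illustrative assumptions.

```python
from statistics import mean


def aggregate_patient(frame_probs_by_location, threshold=0.5):
    """Pool per-frame probabilities into one patient-level prediction.

    frame_probs_by_location maps a body-location name (hypothetical keys)
    to the list of per-frame infection probabilities for that video.
    Returns (patient_score, is_positive).
    """
    # Step 1: one score per video, the mean over its frames.
    video_scores = [
        mean(probs) for probs in frame_probs_by_location.values() if probs
    ]
    if not video_scores:
        raise ValueError("no frame predictions available")
    # Step 2: one score per patient, the mean over the recorded locations.
    patient_score = mean(video_scores)
    return patient_score, patient_score >= threshold


# Example with two of the six locations (made-up probabilities).
score, is_positive = aggregate_patient(
    {"location_a": [0.9, 0.7], "location_b": [0.6, 0.8]}
)
```

Mean pooling treats frames as independent and discards their order, so it does not recover the temporal information the abstract refers to; it only moves the decision from frame level to patient level.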
Problem

Research questions and friction points this paper is trying to address.

Evaluating AI accuracy in diagnosing COVID-19 via lung ultrasound
Assessing limitations of existing AI models for real-world LUS applications
Identifying data heterogeneity and multimodal integration challenges in LUS analysis
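To put accuracy figures in context for a cohort of this size, the sketch below computes the majority-class baseline for 33 positives among 63 patients and a Wilson score interval for an accuracy estimate. The count of 50 correct predictions is an illustrative assumption (roughly 79% of 63); the abstract does not state the evaluation protocol or the exact counts.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


n_patients, n_positive = 63, 33  # cohort size and positives, from the abstract

# Accuracy of always predicting the more frequent class.
majority_baseline = max(n_positive, n_patients - n_positive) / n_patients

assumed_correct = 50  # ASSUMPTION: about 79% of 63, not a reported count
low, high = wilson_interval(assumed_correct, n_patients)
print(f"baseline={majority_baseline:.3f}, interval=({low:.3f}, {high:.3f})")
```

Under these assumptions the interval is roughly 20 percentage points wide, which illustrates why accuracy differences on a cohort of 63 patients should be read with caution.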
Innovation

Methods, ideas, or system contributions that make the work stand out.

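AI models for detecting lung infection and assessing its severity from LUS
Multi-modal models combining LUS images and CBC data
Analysis identifying frame-based processing, which ignores video-level information, as a key limitation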
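One simple way to combine an image-based prediction with a prediction from tabular CBC values is late fusion of the two probabilities in log-odds space. This is a hedged sketch of that general idea, not the architecture evaluated in the paper; `late_fusion`, `image_weight`, and the input probabilities are hypothetical.

```python
import math


def logit(p, eps=1e-6):
    """Log-odds of a probability, clipped away from 0 and 1 for stability."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))


def sigmoid(x):
    """Inverse of logit."""
    return 1 / (1 + math.exp(-x))


def late_fusion(image_prob, cbc_prob, image_weight=0.5):
    """Weighted average of two model outputs in log-odds space.

    image_prob: probability from an image-based LUS model (assumed given).
    cbc_prob:   probability from a model on CBC variables (assumed given).
    image_weight: share of the image model in the fused score, in [0, 1].
    """
    if not 0 <= image_weight <= 1:
        raise ValueError("image_weight must be in [0, 1]")
    fused = image_weight * logit(image_prob) + (1 - image_weight) * logit(cbc_prob)
    return sigmoid(fused)


# Two moderately confident models that agree yield the same confidence;
# two models that disagree symmetrically cancel out to 0.5.
agree = late_fusion(0.7, 0.7)
disagree = late_fusion(0.9, 0.1)
```

In practice the weight would be tuned on held-out data, and a learned fusion model could replace the fixed average.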
Nina Wiedemann
Intel, ETH Zürich
geographic information science, computer vision, discrete optimization
Dianne de Korte-de Boer
Department of Anesthesiology & Pain Medicine, Maastricht UMC+, Maastricht, the Netherlands
Matthias Richter
Institute of Cognitive Science, University of Osnabrück, Germany
Sjors van de Weijer
Center for Acute and Critical Care, Maastricht UMC+, Maastricht, the Netherlands
Charlotte Buhre
Faculty of Health Sciences (FGW), Joint Faculty of the University of Potsdam, the Brandenburg Medical School Theodor Fontane and the Brandenburg Technical University Cottbus-Senftenberg, Germany
Franz A. M. Eggert
Faculty of Health Sciences (FGW), Joint Faculty of the University of Potsdam, the Brandenburg Medical School Theodor Fontane and the Brandenburg Technical University Cottbus-Senftenberg, Germany; Department of Neurosurgery, School for Mental Health and Neuroscience, Maastricht University, the Netherlands
Sophie Aarnoudse
Centre for Acute and Critical Care, Emergency Department, Centre for Chronic Diseases, Department of Medicine, Maastricht UMC+, Maastricht, the Netherlands
Lotte Grevendonk
Department of Anesthesiology & Pain Medicine, Maastricht UMC+, Maastricht, the Netherlands
Steffen Röber
Institute of Cognitive Science, University of Osnabrück, Germany
Carlijn M. E. Remie
Department of Anesthesiology & Pain Medicine, Maastricht UMC+, Maastricht, the Netherlands
Wolfgang Buhre
Department of Anesthesiology & Pain Medicine, Maastricht UMC+, Maastricht, the Netherlands; Department of Anaesthesiology and Division of Vital Functions, University Medical Center Utrecht, Utrecht, the Netherlands
Ronald Henry
Centre for Acute and Critical Care, Emergency Department, Centre for Chronic Diseases, Department of Medicine, Maastricht UMC+, Maastricht, the Netherlands
Jannis Born
IBM Research
AI 4 Science, Language Models, Quantum ML, Machine Learning