Multi-Source COVID-19 Detection via Variance Risk Extrapolation

📅 2025-06-29
🤖 AI Summary
To address domain shift in multi-center chest CT data arising from variations in imaging protocols, scanners, and patient populations, this paper combines Variance Risk Extrapolation (VREx) regularization with Mixup data augmentation to learn domain-invariant representations and improve cross-center generalization for COVID-19 classification. VREx penalizes the variance of empirical risks across source domains to suppress center-specific biases, while Mixup interpolates the inputs and labels of training pairs to encourage linear behavior between examples and improve resilience to noise and limited data. Evaluated on data from four hospitals and medical centers, the method achieves an average macro F1 score of 0.96 on the validation set. The approach offers a simple regularization recipe for domain generalization in medical imaging.

📝 Abstract
We present our solution for the Multi-Source COVID-19 Detection Challenge, which aims to classify chest CT scans into COVID and Non-COVID categories across data collected from four distinct hospitals and medical centers. A major challenge in this task lies in the domain shift caused by variations in imaging protocols, scanners, and patient populations across institutions. To enhance the cross-domain generalization of our model, we incorporate Variance Risk Extrapolation (VREx) into the training process. VREx encourages the model to maintain consistent performance across multiple source domains by explicitly minimizing the variance of empirical risks across environments. This regularization strategy reduces overfitting to center-specific features and promotes learning of domain-invariant representations. We further apply Mixup data augmentation to improve generalization and robustness. Mixup interpolates both the inputs and labels of randomly selected pairs of training samples, encouraging the model to behave linearly between examples and enhancing its resilience to noise and limited data. Our method achieves an average macro F1 score of 0.96 across the four sources on the validation set, demonstrating strong generalization.
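The variance penalty described in the abstract can be illustrated with a small sketch. It assumes the commonly used form "mean of per-domain risks plus a weight times their variance"; the weight `beta`, the function name `vrex_objective`, and the per-hospital loss values are hypothetical and are not taken from the paper.

```python
from statistics import fmean, pvariance


def vrex_objective(domain_risks, beta=1.0):
    """Mean per-domain risk plus beta times the variance of those risks.

    domain_risks: one average training loss per source domain (hospital).
    beta: hypothetical penalty weight; the paper's value is not given here.
    """
    if len(domain_risks) < 2:
        raise ValueError("VREx needs risks from at least two domains")
    return fmean(domain_risks) + beta * pvariance(domain_risks)


# Hypothetical losses for four hospitals, both with the same mean risk.
balanced = [0.30, 0.30, 0.30, 0.30]
skewed = [0.10, 0.10, 0.10, 0.90]

# The skewed case is penalized because one hospital is fit much worse.
print(vrex_objective(balanced), vrex_objective(skewed))
```

With equal mean risk, the objective is larger when one center is fit much worse than the others, which is the pressure toward consistent performance that the abstract describes.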
Problem

Research questions and friction points this paper is trying to address.

Classify COVID-19 in CT scans across multiple hospitals
Address domain shift from imaging and patient differences
Improve cross-domain generalization using VREx and Mixup
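Cross-hospital generalization is reported in the abstract as a macro F1 score averaged over the four sources. A minimal sketch of one way to compute such a score follows, assuming macro F1 is computed per source and the per-source values are then averaged with equal weight; the helper names and the toy labels are hypothetical.

```python
def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of per-class F1 scores (0 = Non-COVID, 1 = COVID)."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def average_macro_f1(per_source):
    """per_source maps a source name to a (y_true, y_pred) pair."""
    values = [macro_f1(t, p) for t, p in per_source.values()]
    return sum(values) / len(values)


# Toy example with two hypothetical sources.
toy = {
    "hospital_a": ([1, 1, 0, 0], [1, 1, 0, 0]),
    "hospital_b": ([1, 1, 0, 0], [1, 0, 0, 0]),
}
print(average_macro_f1(toy))
```

Averaging per source rather than pooling all scans keeps a large hospital from dominating the score, so a model that fails on one small center is still penalized.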
Innovation

Methods, ideas, or system contributions that make the work stand out.

Variance Risk Extrapolation for domain generalization
Mixup data augmentation for robustness
Minimizing empirical risk variance across domains
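The Mixup step listed above can be sketched on a single pair of samples. This assumes the usual formulation in which one mixing coefficient is drawn from a Beta(alpha, alpha) distribution and applied to both inputs and one-hot labels; `alpha=0.4`, the function name `mixup_pair`, and the toy vectors are illustrative assumptions, not values from the paper.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha=0.4, rng=random):
    """Interpolate two samples and their one-hot labels with one coefficient.

    alpha is a hypothetical Beta-distribution parameter.
    rng is anything exposing betavariate (the random module or random.Random).
    """
    lam = rng.betavariate(alpha, alpha)
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Toy feature vectors standing in for CT inputs; labels are [COVID, Non-COVID].
rng = random.Random(0)
x_mix, y_mix, lam = mixup_pair([0.0, 1.0], [1.0, 0.0],
                               [1.0, 0.0], [0.0, 1.0], rng=rng)
print(lam, x_mix, y_mix)
```

Because the same coefficient mixes inputs and labels, the soft label always sums to one, and the model is trained to respond proportionally between the two original examples.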
Authors
Runtian Yuan
Fudan University
Qingqiu Li
Fudan University
Junlin Hou
HKUST | Fudan University
Computer Vision, Medical Image Analysis, Label-efficient Deep Learning, eXplainable AI
Jilan Xu
Fudan University
Computer Vision, Multimodal, Medical Image Analysis
Yuejie Zhang
Fudan University
Rui Feng
Fudan University
Hao Chen
The Hong Kong University of Science and Technology