Domain Generalization and Adaptation in Intensive Care with Anchor Regression

📅 2025-07-29
🤖 AI Summary
Clinical prediction models often lose substantial performance when deployed at new hospitals because of distribution shift. To address heterogeneity in multi-center ICU data, this work presents a large-scale study of causality-inspired domain generalization. First, it applies anchor regression and introduces anchor boosting, a tree-based nonlinear extension. Second, it proposes a framework to quantify the utility of external data as a function of available target-domain data, distinguishing three regimes: domain generalization, domain adaptation, and a data-rich regime. Experiments on nine ICU databases comprising 400,000 patients show that anchor regularization consistently improves out-of-distribution (OOD) performance, particularly for the most dissimilar target domains, and that the methods appear robust to violations of theoretical assumptions such as anchor exogeneity.

📝 Abstract
The performance of predictive models in clinical settings often degrades when deployed in new hospitals due to distribution shifts. This paper presents a large-scale study of causality-inspired domain generalization on heterogeneous multi-center intensive care unit (ICU) data. We apply anchor regression and introduce anchor boosting, a novel, tree-based nonlinear extension, to a large dataset comprising 400,000 patients from nine distinct ICU databases. The anchor regularization consistently improves out-of-distribution performance, particularly for the most dissimilar target domains. The methods appear robust to violations of theoretical assumptions, such as anchor exogeneity. Furthermore, we propose a novel conceptual framework to quantify the utility of large external datasets. By evaluating performance as a function of available target-domain data, we identify three regimes: (i) a domain generalization regime, where only the external model should be used, (ii) a domain adaptation regime, where refitting the external model is optimal, and (iii) a data-rich regime, where external data provides no additional value.
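As a rough illustration of the anchor regularization mentioned in the abstract, linear anchor regression is commonly written as a least-squares problem in which the part of the residual explained by the anchors is reweighted by a factor gamma. The sketch below assumes that standard formulation, which the abstract itself does not spell out. The names `anchor_regression` and `gamma` are hypothetical, and using one-hot hospital indicators as anchors is only an example choice. It is not the authors' implementation and does not cover anchor boosting.

```python
import numpy as np


def anchor_regression(X, y, A, gamma):
    """Minimal linear anchor regression sketch (assumed standard formulation).

    Minimizes ||(I - P_A) r||^2 + gamma * ||P_A r||^2 with r = y - X b - b0,
    where P_A projects onto the span of the (centered) anchors A.
    gamma = 1 recovers ordinary least squares; larger gamma penalizes
    residual variation that the anchors can explain.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.asarray(A, dtype=float)

    # Center everything so the intercept can be recovered afterwards.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    Ac = A - A.mean(axis=0)

    def project(M):
        # Apply P_A without forming the n x n projection matrix.
        coef, *_ = np.linalg.lstsq(Ac, M, rcond=None)
        return Ac @ coef

    # Transform data with W = I + (sqrt(gamma) - 1) * P_A, then run OLS.
    s = np.sqrt(gamma)
    Xt = Xc + (s - 1.0) * project(Xc)
    yt = yc + (s - 1.0) * project(yc)
    beta, *_ = np.linalg.lstsq(Xt, yt, rcond=None)

    intercept = y.mean() - X.mean(axis=0) @ beta
    return beta, intercept
```

The transformation works because the two residual components are orthogonal, so the squared norm of the transformed residual equals the anchor-regression objective. In practice gamma would be chosen by validation on held-out domains; that selection step is not shown here.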
Problem

Research questions and friction points this paper is trying to address.

Addressing predictive model degradation across hospitals due to distribution shifts
Introducing anchor boosting for domain generalization in ICU data
Quantifying the utility of external data as a function of available target-domain data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Anchor regression for ICU domain generalization
Anchor boosting as tree-based nonlinear extension
Framework quantifying external data utility
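One way to read the external-data framework is as a comparison of three strategies at each target sample size: using the external model as is, refitting it with target data, and training on target data alone. The sketch below is only an assumed decision rule built from the abstract's description of the three regimes. The function name `classify_regime`, the `tol` parameter, and all numbers are hypothetical and made up for illustration; they are not results from the paper.

```python
def classify_regime(loss_external, loss_refit, loss_target_only, tol=0.0):
    """Assign a regime from held-out target losses (lower is better).

    loss_external:    external model applied unchanged to the target domain
    loss_refit:       external model refit using the available target data
    loss_target_only: model trained on target data alone
    """
    # External model is at least as good as both alternatives.
    if loss_external <= min(loss_refit, loss_target_only) + tol:
        return "domain generalization"
    # Refitting beats target-only training, so external data still helps.
    if loss_refit + tol < loss_target_only:
        return "domain adaptation"
    # Target-only training matches or beats refitting.
    return "data-rich"


# Made-up losses indexed by number of target-domain patients (illustration only).
hypothetical_curves = {
    25: (0.30, 0.35, 0.50),
    1000: (0.30, 0.26, 0.33),
    100000: (0.30, 0.22, 0.22),
}

for n_target, losses in hypothetical_curves.items():
    print(n_target, classify_regime(*losses))
```

With real data, the three losses would come from repeated evaluation on held-out target patients, and a nonzero `tol` could absorb estimation noise near the regime boundaries.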