Learning Interpretable Point-Based Clinical Risk Scores via Direct Optimization

📅 2026-05-18

🤖 AI Summary

This work proposes a flexible greedy optimization strategy that directly learns interpretable additive risk scores with nonnegative integer weights, avoiding the conventional two-stage pipeline of regression followed by rounding. Rounding scaled regression coefficients is fast but does not guarantee an optimal score, while a full search over integer weights posed as an integer program can be computationally expensive, especially when the value function is nonconcave or discontinuous. The proposed method optimizes an explicit value function under the integer constraints, aiming to keep the score interpretable and sparse at a manageable computational cost. The authors apply it to a large electronic health record (EHR) cohort in Epic Cosmos to construct an integer-weighted comorbidity score for post-discharge mortality risk, and they run a simulation study to examine the method's finite-sample operating characteristics.

📝 Abstract

Many clinical risk scores are deployed as additive rules with nonnegative integer points assigned to relevant binary predictive features. These integer weights not only make the score easier to use in practice but also promote sparsity in the resulting prediction model. Such risk scores are often derived by first fitting a regression model and then rounding the estimated coefficients to the nearest integer after appropriate scaling. This approach is computationally fast but does not guarantee optimality of the resulting score. Alternatively, one may search over all possible integer weights to directly optimize a value function by posing the problem as an integer programming task. However, the associated computational burden can be substantial, especially when the value function is nonconcave or even discontinuous. In this paper, we develop new machine learning algorithms that employ a flexible greedy optimization strategy to learn such additive scoring directly under explicit and sensible optimality objectives. We apply the proposed method to a large electronic health record (EHR) cohort in Epic Cosmos to construct an integer-weighted comorbidity score for measuring the risk of post-discharge mortality. We also conduct a simulation study to examine the finite-sample operating characteristics.
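
To make the idea of greedy search over integer point values concrete, here is a minimal Python sketch. It is not the authors' algorithm, whose details are not given in the abstract. It assumes the value function is the AUC (concordance) of the score, caps each weight at a hypothetical `max_points`, and moves one feature's weight by one point at a time. The function names and the toy data are illustrative only.

```python
def auc(scores, labels):
    """Concordance probability; a tie between a case and a control counts 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need both outcome classes")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def value(weights, X, y):
    """Assumed value function: AUC of the additive score sum_j w_j * x_j."""
    scores = [sum(w * x for w, x in zip(weights, row)) for row in X]
    return auc(scores, y)


def greedy_integer_score(X, y, max_points=3):
    """Greedy coordinate search over weights in {0, ..., max_points}.

    Starts from all-zero weights and, at each pass, applies the single
    +1 / -1 change to one weight that most improves the value function.
    Stops when no single-point change improves it (a local optimum only).
    """
    p = len(X[0])
    weights = [0] * p
    best = value(weights, X, y)
    while True:
        best_move = None
        for j in range(p):
            for step in (1, -1):
                w_new = weights[j] + step
                if not 0 <= w_new <= max_points:
                    continue  # keep weights nonnegative and bounded
                cand = weights[:j] + [w_new] + weights[j + 1:]
                v = value(cand, X, y)
                if v > best + 1e-12:  # require a strict improvement
                    best, best_move = v, cand
        if best_move is None:
            return weights, best
        weights = best_move


# Toy data (made up for illustration): three binary features, binary outcome.
X_demo = [
    [1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 1, 0], [1, 0, 1], [1, 1, 0],
    [0, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 0], [1, 0, 0], [0, 0, 1],
]
y_demo = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

w_demo, v_demo = greedy_integer_score(X_demo, y_demo, max_points=3)
print("points per feature:", w_demo, "AUC:", round(v_demo, 3))
```

Features left at zero points drop out of the score, which is how integer weights can encourage sparsity. Because the search only takes improving single-point moves, it can stop at a local optimum. An exhaustive integer search would not, but its cost grows as `(max_points + 1) ** p` in the number of features `p`.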

Problem

Research questions and friction points this paper is trying to address.

clinical risk scores

integer programming

interpretable machine learning

additive scoring systems

optimality

Innovation

Methods, ideas, or system contributions that make the work stand out.

interpretable scoring

integer optimization

greedy algorithm