How to measure intra-physician variability in clinical decision-making?

📅 2026-05-27
🤖 AI Summary
This study addresses the lack of validated methods for quantifying inconsistency in clinical decision-making. The authors introduce a controllable synthetic data benchmark to systematically evaluate the accuracy and rank fidelity of eight measurement approaches across 94 experimental conditions. The evaluated methods are Euclidean distance, Mahalanobis distance, learned weight matching, genetic Mahalanobis distance, random forest proximity, mutual information weighting, latent profile analysis, and a Bayesian binomial generalized linear mixed model (GLMM). Learned weight matching achieves the lowest error (MAE = 0.027). Under continuous physician heterogeneity, supervised feature-weighting methods and the GLMM retain a moderate-to-high rank correlation with the ground truth (Spearman = 0.62–0.68), whereas unsupervised methods degrade substantially (Spearman = 0.28–0.35).
📝 Abstract
Intra-physician prescribing variability, the probability that one physician issues discordant decisions for two patients deemed comparable on observed covariates, has a major impact on quality of care, safety, and cost. However, no validated measurement method is known. Here, we benchmark eight methods (Euclidean, Mahalanobis, Learned-Weights, Genetic Mahalanobis, Random Forest proximity, Mutual-Information-weighted, Latent Profile Analysis, and a Bayesian binomial generalized linear mixed model) against a synthetic ground truth across 94 experimental conditions. Learned-Weights matching achieves the lowest mean absolute error (0.027), followed by Mutual-Information-weighted matching (0.028) and RF Proximity (0.034). All eight discordance-analysis methods preserve the physician rank ordering with high fidelity (Spearman > 0.89 versus the ground truth on the SCORE2 experiment), as long as the physician variability groups are well separated. Under a continuous-heterogeneity physician model, rank preservation degrades substantially for unsupervised methods (Spearman = 0.28–0.35) but is retained by supervised feature-weighted methods and the GLMM (Spearman = 0.62–0.68). This controlled methodological evaluation lays a foundation for validation on observational prescribing data. Once validated on such data, these open-source estimators could turn prescribing inconsistency into a routinely measurable clinician-level quality metric, systematically complementing the existing literature on between-physician variation.
Problem

Research questions and friction points this paper is trying to address.

intra-physician variability
clinical decision-making
prescribing inconsistency
quality of care
measurement methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

intra-physician variability
prescribing inconsistency
feature-weighted matching
methodological benchmarking
clinical decision-making
Alaedine Benani
Preventive Medicine, Data Science and AI Lab, Zoï, F-75010 Paris, France
Pierre Meneton
Inserm, Sorbonne Université, Université Sorbonne Paris-Nord, Limics, F-75006 Paris, France
Emmanuel Messas
Département cardio-vasculaire, Hôpital européen Georges-Pompidou, université Paris Cité, Inserm UMR 970, F-75015 Paris, France
Liza Hettal
Institut de Cancérologie de Lorraine, Vandoeuvre-lès-Nancy; Université de Lorraine, Nancy, France
Sai Sagireddy
European University Cyprus Frankfurt School of Medicine
Damien Grosgeorge
Preventive Medicine, Data Science and AI Lab, Zoï, F-75010 Paris, France
Jérôme Salomon
Preventive Medicine, Data Science and AI Lab, Zoï, F-75010 Paris, France
Sylvain Bodard
Université de Paris Cité, AP-HP, Hôpital Universitaire Necker Enfants Malades, F-75015 Paris, France; CNRS UMR 7371, INSERM U 1146, LIB, Sorbonne Université, F-75006 Paris, France; Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
Xavier Tannier
Sorbonne Université, Limics
Natural Language Processing, Information Extraction, BioNLP