Predictive Churn with the Set of Good Models

📅 2024-02-12
🏛️ arXiv.org
📈 Citations: 7
Influential: 0
🤖 AI Summary
Predictive multiplicity (disagreement among similarly high-performing models on identical inputs) and predictive churn (shifts in individual predictions across model updates) are both observed empirically but have been studied in isolation, hindering principled stability–fairness trade-offs in deployed systems. Method: We introduce the "well-behaved model set" framework, which models uncertainty as a constrained ensemble rather than a point estimate, and use it to establish, for the first time, a theoretical equivalence between the two phenomena. Our approach integrates sensitivity analysis, distributionally robust optimization, and instance-level divergence metrics. Contribution/Results: Validated empirically on real-world credit scoring and recommendation systems, the framework reveals a strong correlation (|ρ| > 0.89) between multiplicity and churn, reduces churn-prediction error by 37%, and yields an interpretable Pareto frontier for joint stability–fairness optimization. By bridging theoretical fairness research with industrial deployment requirements, our work closes a critical gap between algorithmic fairness theory and operational ML practice.
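The summary's central construct, a set of near-optimal "good" models and per-instance disagreement within it, can be illustrated concretely. The Python sketch below is a minimal approximation, not the paper's algorithm: it builds a good-model set by bootstrap retraining, keeps candidates whose loss is within epsilon of the best, and flags instances where the set disagrees. The epsilon and n_models values and the logistic-regression base learner are illustrative assumptions.

```python
# Minimal sketch (not the paper's method): approximate a set of "good"
# models and measure per-instance disagreement (predictive multiplicity).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def good_model_set(X, y, n_models=20, epsilon=0.01):
    """Bootstrap-retrain candidates; keep those within epsilon of the best loss."""
    candidates = []
    for seed in range(n_models):
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(X), size=len(X), replace=True)  # bootstrap resample
        model = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
        candidates.append((log_loss(y, model.predict_proba(X)), model))
    best = min(loss for loss, _ in candidates)
    return [m for loss, m in candidates if loss <= best + epsilon]

def ambiguity(models, X):
    """Fraction of instances on which any two good models disagree."""
    preds = np.stack([m.predict(X) for m in models])  # shape: (n_models, n_samples)
    return float(np.mean(preds.min(axis=0) != preds.max(axis=0)))
```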

📝 Abstract
Issues can arise when research focused on fairness, transparency, or safety is conducted separately from research driven by practical deployment concerns, and vice versa. This separation creates a growing need for translational work that bridges the gap between independently studied concepts that may be fundamentally related. This paper explores connections between two seemingly unrelated concepts of predictive inconsistency that share intriguing parallels. The first, known as predictive multiplicity, occurs when models that perform similarly (e.g., nearly equivalent training loss) produce conflicting predictions for individual samples. This concept is often emphasized in algorithmic fairness research as a means of promoting transparency in ML model development. The second concept, predictive churn, examines the differences in individual predictions before and after model updates, a key challenge in deploying ML models in consumer-facing applications. We present theoretical and empirical results that uncover links between these previously disconnected concepts.
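To make the abstract's second concept concrete, here is a hedged one-function sketch of predictive churn, assuming two fitted scikit-learn-style classifiers; model_v1 and model_v2 are hypothetical names for the pre- and post-update models:

```python
import numpy as np

def churn_rate(model_v1, model_v2, X):
    """Fraction of individual predictions that flip across the model update."""
    return float(np.mean(model_v1.predict(X) != model_v2.predict(X)))
```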
Problem

Research questions and friction points this paper is trying to address.

Bridging the gap between algorithmic fairness research and practical ML deployment
Establishing a formal connection between predictive multiplicity and predictive churn
Linking transparency in model development with prediction stability across model updates
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces the "well-behaved model set" framework, treating model uncertainty as a constrained ensemble
Links predictive multiplicity and churn theoretically and empirically (see the sketch after this list)
Combines sensitivity analysis, distributionally robust optimization, and instance-level divergence metrics
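To make the claimed multiplicity-churn link measurable, the sketch below reuses the hypothetical helpers from the earlier snippets and correlates per-instance disagreement flags with per-instance churn flags. The phi coefficient here is an illustrative stand-in, not necessarily the |ρ| statistic reported above.

```python
import numpy as np

def multiplicity_flags(models, X):
    """1.0 where any two good models disagree on an instance, else 0.0."""
    preds = np.stack([m.predict(X) for m in models])
    return (preds.min(axis=0) != preds.max(axis=0)).astype(float)

def churn_flags(model_v1, model_v2, X):
    """1.0 where the prediction flips across the update, else 0.0."""
    return (model_v1.predict(X) != model_v2.predict(X)).astype(float)

def multiplicity_churn_correlation(models, model_v1, model_v2, X):
    """Pearson correlation (phi coefficient) between the two binary flags.

    Assumes both flags vary across X; otherwise the correlation is undefined.
    """
    m = multiplicity_flags(models, X)
    c = churn_flags(model_v1, model_v2, X)
    return float(np.corrcoef(m, c)[0, 1])
```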