🤖 AI Summary
This paper addresses model multiplicity in machine learning: the phenomenon in which multiple distinct models achieve comparable predictive performance yet yield substantially different predictions or explanations, undermining decision reliability and introducing arbitrariness. Existing work suffers from conceptual ambiguity, conflation with uncertainty or variance, and a lack of formal foundations. To overcome these limitations, we propose the first formal framework linking model *selection* to *arbitrariness*, generalizing multiplicity beyond predictions to non-predictive outputs (e.g., feature attributions, symbolic rules) and rigorously distinguishing it from statistical uncertainty and estimator variance. Through conceptual analysis, systematic terminology development, cross-paradigm comparison (e.g., frequentist vs. Bayesian, parametric vs. nonparametric), and integration with responsible AI principles, we establish a unified theoretical framework for multiplicity. This framework systematically categorizes the associated risks and societal value, and identifies core open problems, including multiplicity-aware evaluation, optimization, and governance, that chart key directions for future research.
📝 Abstract
Algorithmic modelling relies on limited information in data to extrapolate outcomes for unseen scenarios, often embedding an element of arbitrariness in its decisions. A perspective on this arbitrariness that has recently gained interest is multiplicity: the study of arbitrariness across a set of "good models", i.e., those likely to be deployed in practice. In this work, we systematize the literature on multiplicity by: (a) formalizing the terminology around model design choices and their contribution to arbitrariness, (b) expanding the definition of multiplicity to incorporate underrepresented forms beyond just predictions and explanations, (c) clarifying the distinction between multiplicity and other traditional lenses of arbitrariness, i.e., uncertainty and variance, and (d) distilling the benefits and potential risks of multiplicity into overarching trends, situating it within the broader landscape of responsible AI. We conclude by identifying open research questions and highlighting emerging trends in this young but rapidly growing area of research.