🤖 AI Summary
In contemporary digital markets, the exposure of personal data revealing intersecting identities (e.g., race, gender, and disability) generates privacy externalities: firms profit from data exploitation while users bear societal risks such as discrimination. To internalize these privacy losses, we propose a mutual-information-based data pricing mechanism that quantifies the entropy reduction induced by data usage and implements a Pigouvian tax to curb socially harmful transactions. By integrating information theory with intersectionality theory, our framework yields a model-agnostic pricing rule, requiring no assumptions about the underlying statistical model, and enables regulators to dynamically adjust levies according to social welfare objectives, thereby supporting both corrective and redistributive policy goals. The mechanism is computationally tractable via discretization of the joint distribution and accommodates parametric, nonparametric, and learning-based models. This is the first approach to quantitatively balance data value and societal risk across multiple sensitive attributes, offering an adaptable, fair, and transparent regulatory instrument for data markets.
📝 Abstract
In contemporary digital markets, personal data often reveals not just isolated traits, but complex, intersectional identities based on combinations of race, gender, disability, and other protected characteristics. This exposure generates a privacy externality: firms benefit economically from profiling, prediction, and personalization, while users face hidden costs in the form of social risk and discrimination. We introduce a formal pricing rule that quantifies and internalizes this intersectional privacy loss using mutual information, assigning monetary value to the entropy reduction induced by each datum. The result is a Pigouvian-style surcharge that discourages harmful data trades and rewards transparency. Our formulation has the advantage that it operates independently of the underlying statistical model of the intersectional variables, be it parametric, nonparametric, or learned, and can be approximated in practice by discretizing the intersectional joint probability distributions. We illustrate how regulators can calibrate this surcharge to reflect different societal values, and argue that it provides not just a technical fix to market failures, but also a redistributive shield that empowers vulnerable groups in the face of asymmetric digital power.
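The core computation described in the abstract can be sketched concretely: estimate the mutual information between an intersectional identity variable and a released datum from a discretized joint distribution, then scale it by a regulator-chosen rate to obtain the Pigouvian surcharge. The following is a minimal illustration, not the paper's implementation; the function names and the per-bit rate `lam` are assumptions introduced here.

```python
import numpy as np

def mutual_information(joint: np.ndarray) -> float:
    """Mutual information I(S; D) in bits from a discretized joint p(s, d).

    Rows index intersectional identity states s (e.g., race x gender x
    disability cells); columns index values of the released datum d.
    I(S; D) is the expected entropy reduction about S from observing D.
    """
    joint = joint / joint.sum()                    # normalize to a distribution
    p_s = joint.sum(axis=1, keepdims=True)         # marginal over identities
    p_d = joint.sum(axis=0, keepdims=True)         # marginal over datum values
    indep = p_s @ p_d                              # product of marginals
    mask = joint > 0                               # skip zero-probability cells
    return float((joint[mask] * np.log2(joint[mask] / indep[mask])).sum())

def pigouvian_surcharge(joint: np.ndarray, lam: float) -> float:
    """Price the privacy externality as lam * I(S; D).

    `lam` (currency units per bit) is a hypothetical calibration constant
    a regulator would tune to reflect social welfare objectives.
    """
    return lam * mutual_information(joint)

# A datum that perfectly reveals a binary identity leaks exactly 1 bit,
# so at lam = 2.0 the surcharge is 2.0.
joint = np.array([[0.5, 0.0],
                  [0.0, 0.5]])
print(pigouvian_surcharge(joint, lam=2.0))  # 2.0
```

Because the rule depends only on the discretized joint distribution, the same pricing code applies whether that distribution comes from a parametric fit, a nonparametric estimate, or a learned model, which is the model-agnostic property the abstract emphasizes.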