🤖 AI Summary
This paper addresses the long-term fairness degradation of machine learning systems operating in dynamic environments. We propose a simulation-driven analytical framework that integrates Monte Carlo evolutionary simulation, configuration-space sensitivity analysis, and causal modeling. For the first time, this framework enables quantitative attribution of latent long-term unfairness arising from feedback loops, moving beyond traditional static, model-centric fairness evaluation paradigms. By explicitly simulating the bidirectional interactions between a system and its environment, it identifies decision mechanisms that appear fair in the short term but exacerbate group-level biases over time. We validate the framework on three real-world applications: loan approval, opioid risk scoring, and predictive policing. It supports interpretable, counterfactual attribution of how design choices and environmental factors jointly shape long-term fairness outcomes. The approach provides a novel methodological foundation for developing AI systems with sustainable fairness guarantees.
📄 Abstract
Algorithmic fairness of machine learning (ML) models has raised significant concern in recent years. Many testing, verification, and bias mitigation techniques have been proposed to identify and reduce fairness issues in ML models. Existing methods are model-centric and designed to detect fairness issues under static settings. However, many ML-enabled systems operate in a dynamic environment where the predictive decisions made by the system impact the environment, which in turn affects future decision-making. Such a self-reinforcing feedback loop can cause fairness violations in the long term, even if the immediate outcomes are fair. In this paper, we propose a simulation-based framework called FairSense to detect and analyze long-term unfairness in ML-enabled systems. Given a fairness requirement, FairSense performs Monte Carlo simulation to enumerate evolution traces for each system configuration. Then, FairSense performs sensitivity analysis on the space of possible configurations to understand the impact of design options and environmental factors on the long-term fairness of the system. We demonstrate FairSense's potential utility through three real-world case studies: loan lending, opioid risk scoring, and predictive policing.
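To make the feedback-loop dynamic concrete, the following is a minimal, self-contained sketch of the kind of Monte Carlo trace enumeration the abstract describes. It is not FairSense's actual implementation: the two-group score model, the threshold-based decision policy, the `feedback` strength parameter, and the demographic-parity gap metric are all illustrative assumptions chosen to show how a policy that looks fair at step one can amplify a group-level gap over time.

```python
import random

def simulate_trace(config, horizon=50, seed=0):
    """Evolve a hypothetical two-group population under a threshold
    decision policy and record a demographic-parity gap at each step."""
    rng = random.Random(seed)
    # Assumed initial mean qualification score per group.
    scores = {"A": 0.6, "B": 0.5}
    gap_trace = []
    for _ in range(horizon):
        rates = {}
        for group, mean in scores.items():
            # Approval rate: fraction of 100 sampled individuals whose
            # noisy score clears the decision threshold.
            approvals = sum(
                rng.gauss(mean, 0.1) > config["threshold"] for _ in range(100)
            )
            rates[group] = approvals / 100
            # Feedback loop: high approval rates raise the group's future
            # mean score, low rates depress it (self-reinforcing).
            scores[group] = min(
                1.0, max(0.0, mean + config["feedback"] * (rates[group] - 0.5))
            )
        # Demographic-parity gap between the two groups at this step.
        gap_trace.append(abs(rates["A"] - rates["B"]))
    return gap_trace

def monte_carlo(config, runs=20):
    """Average the end-of-horizon fairness gap over several random traces."""
    return sum(simulate_trace(config, seed=s)[-1] for s in range(runs)) / runs
```

A sensitivity analysis in the spirit of the paper would then sweep `monte_carlo` over a grid of configurations (e.g. varying `threshold` and `feedback`) and compare long-term gaps: in this toy model, setting `feedback` above zero drives the two groups' scores apart, so the final gap grows well beyond its initial value even though the first-step decisions are identical to the no-feedback case.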