🤖 AI Summary
This work addresses the dual challenges of class imbalance and algorithmic fairness in dynamic data-stream settings. We propose CFSMOTE, the first fairness-aware continuous SMOTE pre-processing method, which integrates situation testing into the oversampling process to actively identify and balance fairness-sensitive subgroups, thereby avoiding the trade-off pitfalls that arise from optimizing a single fairness metric. The approach updates incrementally as the stream evolves, yielding a model-agnostic pre-processing framework. Experimental results demonstrate that CFSMOTE significantly outperforms C-SMOTE on multiple group fairness metrics, including statistical parity difference (SPD) and equalized odds difference (EOD), while achieving predictive performance on par with state-of-the-art fairness-aware stream learning algorithms. To our knowledge, this is the first approach to jointly and effectively address class imbalance and algorithmic fairness within a unified streaming framework.
📝 Abstract
As machine learning is increasingly applied in an online fashion to deal with evolving data streams, the fairness of these algorithms is a matter of growing ethical and legal concern. In many use cases, class imbalance in the data must also be addressed to ensure predictive performance. Current fairness-aware stream learners typically tackle these issues through in- or post-processing, focusing on optimizing one specific discrimination metric and handling class imbalance in a separate processing step. While C-SMOTE is a highly effective model-agnostic pre-processing approach for mitigating class imbalance, it often introduces algorithmic bias as a side effect. We therefore propose CFSMOTE, a fairness-aware continuous SMOTE variant, as a pre-processing approach that simultaneously addresses class imbalance and fairness concerns by employing situation testing and balancing fairness-relevant groups during oversampling. Unlike other fairness-aware stream learners, CFSMOTE does not optimize for only one specific fairness metric, thereby avoiding potentially problematic trade-offs. Our experiments show significant improvements on several common group fairness metrics in comparison to vanilla C-SMOTE, while maintaining competitive predictive performance, also in comparison to other fairness-aware algorithms.
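To make the core idea concrete, the sketch below illustrates group-aware SMOTE-style oversampling over a sliding window: instead of balancing classes alone, each (class, sensitive-group) cell is oversampled up to the size of the largest cell. This is a minimal, hypothetical illustration of the balancing principle only; the function names, the windowing simplification, and the omission of situation testing and incremental stream updates are our assumptions, not the paper's actual implementation.

```python
import random


def interpolate(a, b, t):
    """SMOTE-style synthetic point: linear interpolation between two
    same-cell instances a and b (feature vectors as lists of floats)."""
    return [ai + t * (bi - ai) for ai, bi in zip(a, b)]


def rebalance(window, rng=None):
    """Oversample a window of (features, sensitive_attr, label) tuples so
    that every (label, sensitive_attr) cell reaches the size of the
    largest cell. Balancing cells jointly over class AND sensitive group
    is the group-aware twist; plain class-only SMOTE would ignore
    sensitive_attr and could leave subgroup imbalance intact."""
    rng = rng or random.Random(42)
    cells = {}
    for x, s, y in window:
        cells.setdefault((y, s), []).append(x)
    target = max(len(xs) for xs in cells.values())
    out = list(window)
    for (y, s), xs in cells.items():
        for _ in range(target - len(xs)):
            # Pick two cell members (possibly identical for tiny cells)
            # and emit a synthetic instance between them.
            a, b = rng.choice(xs), rng.choice(xs)
            out.append((interpolate(a, b, rng.random()), s, y))
    return out
```

In a streaming setting, CFSMOTE would maintain such balance incrementally as instances arrive rather than recomputing over a static window as done here.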