🤖 AI Summary
Data-driven software in socially critical domains often exhibits fairness violations that go undetected, along with poor interpretability, because developers lack insight into the decision logic and because explicit and implicit biases are embedded in the training data.
Method: We propose the first explainable fairness debugging framework that integrates counterfactual fairness testing with human-in-the-loop debugging. It combines counterfactual instance generation, fairness sensitivity analysis, and multi-model fairness–accuracy trade-off comparison to visualize dataset logic, decision paths, and per-instance attributions. Crucially, it can detect implicit fairness defects beyond the training set.
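The core of counterfactual fairness testing can be sketched as follows: perturb only the protected attribute of an input and flag any instance whose prediction flips. This is a minimal illustrative sketch on synthetic data, not FairLay-ML's actual API; the column layout, model, and protected attribute are assumptions.

```python
# Minimal sketch of counterfactual fairness testing (illustrative, not
# FairLay-ML's real interface): flip the protected attribute of each input
# and flag instances whose prediction changes.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy dataset: columns [feature_0, feature_1, protected], protected in {0, 1}.
X = rng.normal(size=(500, 3))
X[:, 2] = rng.integers(0, 2, size=500)
# Label deliberately leaks the protected attribute, so violations exist.
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.3, size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)

def counterfactual_violations(model, X, protected_col=2):
    """Indices where flipping the protected attribute flips the prediction."""
    X_cf = X.copy()
    X_cf[:, protected_col] = 1 - X_cf[:, protected_col]  # flip 0 <-> 1
    return np.flatnonzero(model.predict(X) != model.predict(X_cf))

violations = counterfactual_violations(model, X)
print(f"{len(violations)} of {len(X)} instances violate counterfactual fairness")
```

Because the generated counterfactuals need not appear in the training data, this style of test can surface implicit defects beyond the development dataset, which is the capability the summary highlights.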
Contribution/Results: Two empirical studies measure the false positives and false negatives of prevalent counterfactual testing and uncover human cognitive biases in how people interpret counterfactual examples. The open-source tool and benchmark dataset are publicly available.
📝 Abstract
Data-driven software solutions are increasingly used in critical domains with significant socio-economic, legal, and ethical implications. The rapid adoption of data-driven solutions, however, poses major threats to the trustworthiness of automated decision-support software. Developers' diminished understanding of these solutions and historical/current biases in the datasets are primary challenges. To aid data-driven software developers and end-users, we present FairLay-ML, a debugging tool to test and explain the fairness implications of data-driven solutions. FairLay-ML visualizes the logic of datasets, trained models, and decisions for a given data point. In addition, it trains various models with varying fairness-accuracy trade-offs. Crucially, FairLay-ML incorporates counterfactual fairness testing that finds bugs beyond the development datasets. We conducted two studies with FairLay-ML that allowed us to measure false positives/negatives in prevalent counterfactual testing and to understand the human perception of counterfactual test cases through a class survey. FairLay-ML and its benchmarks are publicly available at https://github.com/Pennswood/FairLay-ML. The live version of the tool is available at https://fairlayml-v2.streamlit.app/. We provide a video demo of the tool at https://youtu.be/wNI9UWkywVU?t=133.
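The "various models with varying fairness-accuracy trade-offs" idea can be illustrated with a small sketch: train several model families on the same data and report accuracy alongside a simple group-fairness metric. The synthetic data, model choices, and the use of demographic parity difference as the metric are illustrative assumptions, not FairLay-ML's actual implementation.

```python
# Illustrative sketch (not FairLay-ML's real pipeline): compare several
# models by accuracy and a group-fairness metric (demographic parity
# difference) to expose the fairness-accuracy trade-off.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(800, 4))
X[:, 3] = rng.integers(0, 2, size=800)           # protected attribute
y = (X[:, 0] + 0.4 * X[:, 3] > 0).astype(int)    # label leaks the attribute

def demographic_parity_diff(y_pred, group):
    """|P(pred=1 | group=1) - P(pred=1 | group=0)|."""
    return abs(y_pred[group == 1].mean() - y_pred[group == 0].mean())

results = {}
for name, clf in [("logreg", LogisticRegression()),
                  ("tree", DecisionTreeClassifier(max_depth=4, random_state=0)),
                  ("forest", RandomForestClassifier(n_estimators=50, random_state=0))]:
    clf.fit(X, y)
    pred = clf.predict(X)
    results[name] = (clf.score(X, y), demographic_parity_diff(pred, X[:, 3]))

for name, (acc, dpd) in results.items():
    print(f"{name}: accuracy={acc:.3f}, demographic-parity diff={dpd:.3f}")
```

A table like this lets a developer pick the operating point between predictive performance and group fairness, which is the comparison the tool visualizes interactively.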