🤖 AI Summary
Machine learning systems are widely deployed in sensitive domains, yet their potentially discriminatory decisions raise serious fairness concerns. Although numerous open-source fairness APIs exist for bias detection and mitigation, developers commonly face challenges, including knowledge gaps, technical barriers, and resource dependencies. This paper presents the first large-scale empirical study of fairness tool usage, analyzing 204 GitHub repositories that employ 13 mainstream fairness tools. We identify two primary application objectives and 17 concrete use cases. Through systematic code review, documentation analysis, and community discussion mining, we categorize 12 representative obstacles spanning technical implementation and domain understanding. Our findings reveal a pronounced “capability–tool” gap in fairness practice, i.e., a misalignment between developers’ proficiency and tool requirements. The study provides empirically grounded insights and actionable recommendations for fairness-aware tool design, engineering education, and software development process improvement.
📝 Abstract
Machine learning (ML) software systems are used frequently in our day-to-day lives. Some of these systems are used in sensitive environments to make life-changing decisions. It is therefore crucial to ensure that these AI/ML systems do not make discriminatory decisions against any specific group or population. To that end, open-source software libraries for bias detection and mitigation (aka API libraries) are being developed and used. In this paper, we conduct a qualitative study to understand in what scenarios these open-source fairness APIs are used in the wild, how they are used, and what challenges developers face while developing and adopting these libraries. We analyzed 204 GitHub repositories (from a list of 1,885 candidate repositories) that used 13 APIs developed to address bias in ML software. We found that these APIs are used for two primary purposes (i.e., learning and solving real-world problems), targeting 17 unique use cases. Our study suggests that developers are not well-versed in bias detection and mitigation; they face many troubleshooting issues and frequently ask for opinions and resources. Our findings can be instrumental for future bias-related software engineering research and for guiding educators in developing more up-to-date curricula.