dsld: A Socially Relevant Tool for Teaching Statistics

📅 2024-11-06
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses systemic bias against protected groups (defined by attributes such as race, gender, and age) in predictive algorithms. To this end, the authors introduce DSLD, an open-source toolkit integrating statistical modeling, causal inference, and legal compliance perspectives, available for both R and Python. Methodologically, DSLD unifies confounder identification, bias quantification (e.g., disparate impact, equalized odds), counterfactual debiasing, and interactive visualization. It is accompanied by an 80-page Quarto-based pedagogical manual that bridges theory, implementation, and real-world discrimination case studies. The key contribution is a transparent, reproducible, interdisciplinary fairness-analysis workflow, designed for auditability and regulatory alignment, that has been deployed in university statistics instruction and public policy evaluation. Empirical results demonstrate significant improvements in learners' ability to detect algorithmic bias empirically and in legal practitioners' capacity to interpret technical fairness assessments.
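The two bias-quantification metrics named in the summary can be illustrated with a short, self-contained sketch. Note this is plain Python written for this page, not the dsld package's actual API; the function names, toy data, and the 0.8 "four-fifths" threshold comment are illustrative assumptions.

```python
# Illustrative sketch (NOT dsld's API): disparate impact and equalized odds,
# two fairness metrics mentioned in the paper summary, in plain Python.

def disparate_impact(y_pred, group):
    """Ratio of positive-prediction rates: protected group (1) vs. reference (0).
    Values below 0.8 are the common 'four-fifths rule' red flag."""
    pos_prot = [p for p, g in zip(y_pred, group) if g == 1]
    pos_ref = [p for p, g in zip(y_pred, group) if g == 0]
    return (sum(pos_prot) / len(pos_prot)) / (sum(pos_ref) / len(pos_ref))

def equalized_odds_gap(y_true, y_pred, group):
    """Largest between-group difference in true-positive or false-positive rate."""
    def rates(g):
        tp = sum(1 for t, p, gr in zip(y_true, y_pred, group)
                 if gr == g and t == 1 and p == 1)
        pos = sum(1 for t, gr in zip(y_true, group) if gr == g and t == 1)
        fp = sum(1 for t, p, gr in zip(y_true, y_pred, group)
                 if gr == g and t == 0 and p == 1)
        neg = sum(1 for t, gr in zip(y_true, group) if gr == g and t == 0)
        return tp / pos, fp / neg
    tpr0, fpr0 = rates(0)
    tpr1, fpr1 = rates(1)
    return max(abs(tpr0 - tpr1), abs(fpr0 - fpr1))

# Toy data: labels, model predictions, and group membership (0 = reference)
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
group  = [0, 0, 0, 0, 1, 1, 1, 1]
```

On this toy data the two groups receive positive predictions at equal rates (disparate impact of 1.0), yet their error rates differ, showing why a single metric can miss a disparity that another catches.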

📝 Abstract
The growing power of data science can play a crucial role in addressing social discrimination, necessitating nuanced understanding and effective mitigation strategies for biases. "Data Science Looks At Discrimination" (DSLD) is an R and Python package designed to provide users with a comprehensive toolkit of statistical and graphical methods for assessing possible discrimination related to protected groups such as race, gender, and age. The package addresses critical issues by identifying and mitigating confounders and reducing bias against protected groups in prediction algorithms. In educational settings, DSLD offers instructors powerful tools to teach statistical principles through motivating real-world examples of discrimination analysis. The inclusion of an 80-page Quarto book further supports users, from statistics educators to legal professionals, in effectively applying these analytical tools to real-world scenarios.
Problem

Research questions and friction points this paper is trying to address.

Develops tools to assess discrimination in data science
Mitigates bias in algorithms for protected groups
Teaches statistics using real-world discrimination examples
Innovation

Methods, ideas, or system contributions that make the work stand out.

R and Python package for discrimination analysis
Toolkit for statistical and graphical bias assessment
Includes Quarto book for real-world application guidance
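The confounder-identification idea listed above can be sketched as simple stratification: compare group outcomes within each level of a suspected confounder rather than in the pooled data. This is an illustrative sketch with made-up wage data, not dsld's actual implementation or interface.

```python
# Illustrative sketch (NOT dsld's implementation): adjusting a group outcome
# gap for a confounder by stratifying on it, then averaging within-stratum gaps.
from collections import defaultdict

def raw_gap(outcome, group):
    """Unadjusted difference in mean outcome: group 1 minus group 0."""
    m1 = [o for o, g in zip(outcome, group) if g == 1]
    m0 = [o for o, g in zip(outcome, group) if g == 0]
    return sum(m1) / len(m1) - sum(m0) / len(m0)

def adjusted_gap(outcome, group, confounder):
    """Average of the group-1-minus-group-0 gaps within each confounder level."""
    strata = defaultdict(lambda: ([], []))  # level -> (group-0 list, group-1 list)
    for o, g, c in zip(outcome, group, confounder):
        strata[c][g].append(o)
    gaps = [sum(s1) / len(s1) - sum(s0) / len(s0)
            for s0, s1 in strata.values() if s0 and s1]
    return sum(gaps) / len(gaps)

# Toy wage data: group 1 is concentrated in the lower-paying occupation (occ = 1),
# so the pooled gap overstates the within-occupation disparity.
wage  = [50, 52, 30, 32, 48, 31, 33, 29]
group = [0,  0,  0,  1,  1,  1,  1,  1]
occ   = [0,  0,  1,  1,  0,  1,  1,  1]
```

Here the pooled gap is about -9.4, but within occupations it shrinks to -0.875, the kind of Simpson's-paradox reversal that motivates confounder analysis in discrimination studies.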
Taha Abdullah
Department of Computer Science, University of California, Davis
Arjun Ashok
Mila-Quebec AI Institute, ServiceNow Research
Brandon Estrada
Department of Computer Science, University of California, Davis
Norman Matloff
Department of Computer Science, University of California, Davis
Aditya Mittal
Professor of Biological Sciences, Indian Institute of Technology Delhi