Automatic Causal Fairness Analysis with LLM-Generated Reporting

📅 2026-04-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a gap in existing AutoML frameworks, most of which do not evaluate fairness in the training data or in the resulting predictions. It proposes FairMind, a prototype that combines causal fairness theory with large language models (LLMs) under the assumptions of the standard fairness model. Using counterfactual queries and closed-form causal effect estimation, the method quantifies fairness at the dataset level and extends to ordinal protected attributes and continuous outcome variables, supported by new effect decomposition results. LLMs are then used in a zero-shot setting to generate natural-language fairness reports. Worked examples illustrate the expected advantages over a direct LLM-based analysis of the data.

📝 Abstract

AutoML, understood as the process of automating the application of machine learning to real-world problems, is a key step for AI popularisation. Most AutoML frameworks do not account for the potential lack of fairness in the training data and in the corresponding predictions. We introduce FairMind, a software prototype that aims to automate fairness analysis at the dataset level. We achieve this by adopting the assumptions of the standard fairness model, recently proposed by Plečko and Bareinboim. This allows for a sound fairness evaluation in terms of causal effects, based on counterfactual queries involving the target, possible confounders and mediators, and the different values of an input feature we regard as protected. After the necessary data preprocessing, the tool implements a closed-form computation of the effects. LLMs are then used to generate accurate reports on the fairness levels detected in the training dataset. We do so in a zero-shot setup and show by examples the expected advantages over a direct analysis performed by the LLM. To favour applications, we also discuss extensions to ordinal protected variables and continuous targets, as well as novel decomposition results.
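
To make the idea of a closed-form effect computation concrete, the following is a minimal sketch, not FairMind's implementation. It assumes a discrete toy dataset with a binary protected attribute `x`, a single discrete confounder `z`, and an outcome `y`; mediators are left out for brevity. The function names `total_variation` and `total_effect` and the row layout are hypothetical. It contrasts the raw observational gap with a confounder-adjusted effect obtained by the standard back-door adjustment formula.

```python
from collections import defaultdict


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    values = list(values)
    return sum(values) / len(values)


def total_variation(rows, x0, x1):
    """Observational gap: E[Y | X=x1] - E[Y | X=x0]."""
    y1 = [r["y"] for r in rows if r["x"] == x1]
    y0 = [r["y"] for r in rows if r["x"] == x0]
    return mean(y1) - mean(y0)


def total_effect(rows, x0, x1):
    """Back-door adjustment: sum_z (E[Y|x1,z] - E[Y|x0,z]) * P(z).

    Assumes (hypothetically) that z blocks all confounding paths
    between the protected attribute and the outcome.
    """
    by_z = defaultdict(list)
    for r in rows:
        by_z[r["z"]].append(r)
    n = len(rows)
    effect = 0.0
    for z, group in by_z.items():
        y1 = [r["y"] for r in group if r["x"] == x1]
        y0 = [r["y"] for r in group if r["x"] == x0]
        if not y1 or not y0:
            raise ValueError(f"no overlap between groups for z={z!r}")
        effect += (mean(y1) - mean(y0)) * len(group) / n
    return effect


# Toy data: the outcome depends only on the confounder z,
# but z is unevenly distributed across the two groups of x.
rows = (
    [{"x": 0, "z": 0, "y": 0}] * 3
    + [{"x": 1, "z": 0, "y": 0}]
    + [{"x": 0, "z": 1, "y": 1}]
    + [{"x": 1, "z": 1, "y": 1}] * 3
)

print(total_variation(rows, 0, 1))  # 0.5
print(total_effect(rows, 0, 1))     # 0.0
```

In this toy example the observed disparity of 0.5 is entirely spurious: once the confounder is adjusted for, the effect of the protected attribute on the outcome vanishes. Separating these components is the kind of distinction a causal fairness analysis is meant to expose.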

Problem

Research questions and friction points this paper is trying to address.

AutoML

fairness

causal analysis

protected variable

machine learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

causal fairness

counterfactual reasoning

AutoML