🤖 AI Summary
High-risk AI decisions are vulnerable to structural bias, leading to proxy discrimination and unfair outcomes, yet existing auditing methods struggle to pinpoint the root causes of that unfairness. This paper introduces a novel framework that integrates formal abductive explanation with capability-aligned mapping: by modeling domain-specific background knowledge and cross-group capability alignment functions, it identifies features that illegitimately serve as proxies for protected attributes, enabling substantive, individual-level attribution of proxy discrimination. The approach goes beyond conventional fairness auditing by providing an interpretable, attributable fairness diagnosis that traces unfair outcomes to structural bias. A proof of concept on the German Credit dataset illustrates how the framework uncovers latent proxy mechanisms, improving transparency and accountability in AI decision-making.
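To give the two named ingredients a concrete shape, here is a minimal formalization sketch: the standard notion of an abductive explanation from the formal explainable-AI literature, plus a hypothetical aptitude-preserving alignment map. The symbols $\kappa$, $B$, $\alpha$, and $m$ are illustrative and not necessarily the paper's notation.

```latex
% Abductive explanation (standard formal-XAI notion): a subset-minimal set S of
% features whose values on x, together with background theory B, entail the
% decision kappa(x) = c.
\[
  \forall x' .\; \Big( B(x') \wedge \bigwedge_{i \in S} x'_i = x_i \Big)
  \;\rightarrow\; \kappa(x') = c .
\]
% Hypothetical aptitude-preserving alignment between groups a and b:
\[
  m : \mathcal{X}_a \to \mathcal{X}_b , \qquad \alpha\big(m(x)\big) = \alpha(x) .
\]
% A feature i in S is a candidate proxy when the decision is not invariant
% under m, i.e. kappa(m(x)) differs from kappa(x) despite equal aptitude.
```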
📝 Abstract
Artificial intelligence (AI) systems in high-stakes domains raise concerns about proxy discrimination, unfairness, and explainability. Existing audits often fail to reveal why unfairness arises, particularly when it is rooted in structural bias. We propose a novel framework that uses formal abductive explanations to explain proxy discrimination in individual AI decisions. Leveraging background knowledge, our method identifies which features act as unjustified proxies for protected attributes, revealing hidden structural biases. Central to our approach is the concept of aptitude, a task-relevant property independent of group membership, together with a mapping function that aligns individuals of equivalent aptitude across groups so that fairness can be assessed substantively. As a proof of concept, we showcase the framework on examples from the German Credit dataset, demonstrating its applicability to real-world cases.
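The following is a minimal, hypothetical sketch of the idea rather than the paper's implementation: a brute-force abductive explanation for a toy credit rule over German-Credit-style features, compared across a pair of applicants assumed to have equal aptitude. The rule, feature names, and the aligned pair are all illustrative assumptions.

```python
from itertools import combinations

# Toy decision rule over two German-Credit-style features (assumed, for illustration).
def approve(x):
    return x["employment_years"] >= 4 and not x["recent_default"]

# Background knowledge: admissible values for each feature.
DOMAIN = {
    "employment_years": range(0, 41),
    "recent_default": [False, True],
}

def entails(fixed, decision):
    """True if fixing `fixed` forces `decision` for every admissible completion."""
    free = [f for f in DOMAIN if f not in fixed]

    def rec(i, partial):
        if i == len(free):
            return approve(partial) == decision
        return all(rec(i + 1, {**partial, free[i]: v}) for v in DOMAIN[free[i]])

    return rec(0, dict(fixed))

def abductive_explanation(x):
    """Smallest subset of x's feature values that entails the model's decision."""
    decision = approve(x)
    for k in range(1, len(DOMAIN) + 1):
        for subset in combinations(DOMAIN, k):
            fixed = {f: x[f] for f in subset}
            if entails(fixed, decision):
                return fixed, decision
    return dict(x), decision

# Hypothetical pair: two applicants assumed to have equal task-relevant aptitude,
# differing only on a feature suspected of proxying group membership.
applicant = {"employment_years": 2, "recent_default": False}
aligned_counterpart = {"employment_years": 5, "recent_default": False}

print(abductive_explanation(applicant))            # ({'employment_years': 2}, False)
print(abductive_explanation(aligned_counterpart))  # ({'employment_years': 5, 'recent_default': False}, True)
# If the decision flips across aptitude-aligned individuals, the features in the
# explanation are candidates for illegitimate proxies of the protected attribute.
```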