🤖 AI Summary
This paper addresses the problem of verifying counterfactual fairness for probabilistic classifiers. We propose a formal method based on typed natural deduction, extending the TNDPQ calculus with causally annotated structural conditions that embed structural causal models and probabilistic reasoning into the type system. This enables a rigorous logical characterization of the counterfactual proposition: “Would the decision remain unchanged if a sensitive attribute were altered?” The resulting labeled proof system supports automated derivation and formally verifiable fairness certification, overcoming key limitations of traditional statistical fairness definitions—namely, their lack of causal semantics and formal provability. Empirical evaluation demonstrates that our framework effectively detects latent counterfactual unfairness in black-box classifiers, providing an interpretable and formally grounded mechanism for fairness assurance in trustworthy AI systems.
📝 Abstract
In this article we propose an extension of the typed natural deduction calculus TNDPQ to model the verification of counterfactual fairness in probabilistic classifiers. This is obtained by formulating specific structural conditions for causal labels and by checking that the evaluation is robust under their variation.