🤖 AI Summary
This work addresses the amplification of training-data biases in the outputs of generative AI, a problem that existing inference-time mitigation methods handle only empirically and without formal guarantees. The paper introduces CTLF, a branching-time logic with a counting-worlds semantics in which each world represents a possible output at a given step of generation. Its modal operators support three tasks: verifying that the current output series respects an intended probability distribution over a protected attribute, predicting the likelihood of staying within acceptable bounds as further outputs are generated, and determining how many outputs must be removed to restore fairness. The framework is illustrated on a toy example of biased image generation, where CTLF formulas express concrete fairness properties at different points in the output series.
📝 Abstract
Generative AI systems are known to amplify biases present in their training data. While several inference-time mitigation strategies have been proposed, they remain largely empirical and lack formal guarantees. In this paper, we introduce CTLF, a branching-time logic designed to reason about bias in series of generative AI outputs. CTLF adopts a counting-worlds semantics in which each world represents a possible output at a given step in the generation process. It introduces modal operators that allow us to verify whether the current output series respects an intended probability distribution over a protected attribute, to predict the likelihood of remaining within acceptable bounds as new outputs are generated, and to determine how many outputs must be removed in order to restore fairness. We illustrate the framework on a toy example of biased image generation, showing how CTLF formulas can express concrete fairness properties at different points in the output series.
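The three questions that the modal operators are said to answer can be made concrete with a small numerical sketch. The code below is not CTLF and does not reproduce its syntax or semantics; it is only an illustration under several assumptions that do not come from the abstract: the function names `within_bounds`, `prob_within_after`, and `min_removals` are hypothetical, "acceptable bounds" is read as an absolute tolerance `tol` around each target frequency, the protected attribute is binary in the prediction step, and new outputs are assumed independent with a fixed probability `q`.

```python
from collections import Counter
from itertools import product
from math import comb

EPS = 1e-9  # slack for floating-point comparisons


def _counts_ok(counts, target, tol):
    """Check that every value's frequency lies within tol of its target."""
    n = sum(counts.values())
    if n == 0:
        return True  # an empty series violates nothing
    return all(abs(counts.get(v, 0) / n - p) <= tol + EPS
               for v, p in target.items())


def within_bounds(outputs, target, tol):
    """Does the current output series respect the intended distribution?"""
    return _counts_ok(Counter(outputs), target, tol)


def prob_within_after(outputs, target, tol, value, q, k):
    """Probability of being within bounds after k further outputs.

    Assumes a binary attribute and that each new output independently
    takes `value` with probability q (an assumption of this sketch).
    """
    (other,) = [v for v in target if v != value]
    counts = Counter(outputs)
    total = 0.0
    for j in range(k + 1):  # j = number of new outputs equal to `value`
        future = {value: counts[value] + j, other: counts[other] + k - j}
        if _counts_ok(future, target, tol):
            total += comb(k, j) * q**j * (1 - q) ** (k - j)
    return total


def min_removals(outputs, target, tol):
    """Fewest outputs to delete so that a non-empty remainder is within bounds.

    Brute force over removal counts per value; fine for toy-sized series.
    Returns None if no non-empty remainder satisfies the bounds.
    """
    counts = Counter(outputs)
    values = list(target)
    best = None
    for removed in product(*(range(counts[v] + 1) for v in values)):
        remaining = {v: counts[v] - r for v, r in zip(values, removed)}
        if sum(remaining.values()) == 0:
            continue
        if _counts_ok(remaining, target, tol):
            if best is None or sum(removed) < best:
                best = sum(removed)
    return best


# Toy series: four outputs with attribute value "A", one with "B".
series = ["A", "A", "A", "A", "B"]
target = {"A": 0.5, "B": 0.5}
fair_now = within_bounds(series, target, tol=0.1)
p_fair_later = prob_within_after(series, target, 0.1, "A", q=0.5, k=5)
to_remove = min_removals(series, target, tol=0.1)
```

On this toy series the current outputs are outside the tolerance, and removing three "A" outputs is the smallest deletion that leaves a balanced remainder. A logic such as CTLF would state these properties as formulas over a branching structure of possible outputs, whereas this sketch only computes the corresponding quantities directly.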