🤖 AI Summary
This work addresses two limitations: traditional static classifiers ignore past decisions and therefore cannot ensure runtime fairness, and existing fairness shields rely on deterministic interventions and struggle to balance short-term safety with long-term liveness. To overcome these challenges, the paper introduces the “energy shield”, a lightweight adaptive controller inspired by physical energy functions that intervenes probabilistically to steer decision sequences toward fairness dynamically and smoothly. The proposed method simultaneously guarantees short-term safety, by keeping fairness metrics within a target interval with high probability, and long-term liveness, by ensuring that the limiting values of the fairness metrics lie within the desired range. Experimental results demonstrate that, compared to existing methods, the energy shield achieves superior performance in both short- and long-term fairness while requiring fewer interventions.
📝 Abstract
Runtime fairness is not a one-time constraint but a dynamic property evaluated over a sequence of decisions.
To ensure fairness at runtime, it is necessary to account for past decisions, information neglected by conventional, static classifiers.
Traditional fairness shields enforce runtime fairness abruptly,
by intervening \emph{deterministically} whenever a sequence of decisions violates the target for a running fairness measure. This motivates our \emph{main conceptual contribution: \textbf{energy shields}.}
An energy shield is a novel, lightweight, adaptive controller that monitors a sequence of decisions and intervenes \emph{probabilistically}, using physics-inspired energy functions to nudge the sequence smoothly toward fairness:
the more unfair the decisions, the stronger the nudging force becomes. This makes energy shields the \emph{\textbf{first}} fairness shields to provide both \emph{short-term safety and long-term liveness guarantees}.
Safety ensures that the running fairness measure stays within a running target interval with high probability,
and liveness ensures that the limit of the fairness measure lies within the limit target interval.
Intuitively, the short-term target specifies the tolerated fairness values and the long-term target specifies the desired fairness values.
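One way to read the two guarantees symbolically is sketched below. The symbols are assumptions introduced here for illustration: $\varphi_t$ is the running fairness measure after $t$ decisions, $I_t$ the running target interval, $I_\infty$ the limit target interval, and $\delta$ a small failure probability. The abstract does not say whether the safety bound holds per step or uniformly over a horizon, nor in what probabilistic sense the limit is taken, so the quantifiers shown are only one plausible choice.

```latex
\[
  \text{(safety)}\quad \Pr\bigl[\varphi_t \in I_t\bigr] \ge 1 - \delta
  \ \text{ for every step } t,
  \qquad
  \text{(liveness)}\quad \lim_{t \to \infty} \varphi_t \in I_\infty .
\]
```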
We also provide a synthesis procedure for constructing the least intrusive energy shield for a given target specification, and demonstrate its efficiency experimentally.
We evaluate our energy shields against existing fairness shields through the lens of short- and long-term fairness.
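The probabilistic nudging idea can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the paper's construction or its synthesis procedure: the fairness measure (a running acceptance-rate gap between two groups), the spring-like energy function, the mapping from energy to an intervention probability, and all names and numbers (`intervention_probability`, `simulate`, `tolerance`, `stiffness`, the base acceptance rates) are hypothetical.

```python
import math
import random


def intervention_probability(gap, tolerance, stiffness):
    """Hypothetical rule: zero inside the tolerated band, rising with the excess."""
    excess = max(0.0, abs(gap) - tolerance)
    energy = 0.5 * stiffness * excess ** 2   # spring-like potential (an assumption)
    return 1.0 - math.exp(-energy)           # maps energy in [0, inf) into [0, 1)


def simulate(steps, tolerance=0.05, stiffness=400.0, seed=0):
    """Run a biased toy classifier behind the sketch shield.

    Returns the final acceptance-rate gap (group a minus group b)
    and the number of interventions.
    """
    rng = random.Random(seed)
    base_accept = {"a": 0.7, "b": 0.4}       # made-up, deliberately biased rates
    seen = {"a": 0, "b": 0}
    accepted = {"a": 0, "b": 0}
    interventions = 0

    def rate(group):
        return accepted[group] / seen[group] if seen[group] else 0.0

    for _ in range(steps):
        group = rng.choice("ab")
        decision = rng.random() < base_accept[group]
        gap = rate("a") - rate("b")
        favoured = "a" if gap > 0 else "b"
        # Accepting the favoured group or rejecting the other one
        # does not help close the gap.
        widens = decision == (group == favoured)
        p = intervention_probability(gap, tolerance, stiffness)
        if widens and rng.random() < p:
            decision = not decision          # probabilistic nudge toward fairness
            interventions += 1
        seen[group] += 1
        accepted[group] += int(decision)

    return rate("a") - rate("b"), interventions


if __name__ == "__main__":
    print("unshielded:", simulate(5000, stiffness=0.0))
    print("shielded:  ", simulate(5000))
```

Setting `stiffness=0.0` switches the sketch shield off, which gives a baseline for comparing the final gap and the intervention count. A deterministic shield would correspond to a probability that jumps from 0 to 1 at the band edge, whereas here the probability grows gradually with the excess.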