Exposing the Illusion of Fairness: Auditing Vulnerabilities to Distributional Manipulation Attacks

📅 2025-07-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work identifies a novel data-manipulation threat in AI compliance auditing: an adversary can apply minimal perturbations to the training data distribution so that global fairness metrics (e.g., statistical parity, equal opportunity) are artificially satisfied, creating an illusion of compliance. Method: the authors introduce the first systematic framework for fairness-constrained perturbation of data distributions, grounded in entropy-regularized projection and optimal transport theory, together with a statistical hypothesis-testing framework for detecting such manipulation. Results: experiments on benchmark tabular datasets (Adult, COMPAS) show that perturbing fewer than 0.5% of samples suffices to deceive mainstream fairness auditors into certifying unfair models as compliant, while the proposed detector identifies these stealthy manipulations with over 92% accuracy. The study exposes a fundamental vulnerability in audit paradigms that rely solely on static global fairness metrics, provides a deployable defense, and moves trustworthy AI auditing from static metric verification toward dynamic robustness validation.

📝 Abstract
Proving the compliance of AI algorithms has become an important challenge with the growing deployment of such algorithms in real-life applications. Inspecting possible biased behaviors is mandatory to satisfy the constraints of the EU Artificial Intelligence Act. Regulation-driven audits increasingly rely on global fairness metrics, with Disparate Impact being the most widely used. Yet such global measures depend strongly on the distribution of the sample on which they are computed. We first investigate how to manipulate data samples to artificially satisfy fairness criteria, creating minimally perturbed datasets that remain statistically indistinguishable from the original distribution while satisfying prescribed fairness constraints. We then study how to detect such manipulation. Our analysis (i) introduces mathematically sound methods for modifying empirical distributions under fairness constraints using entropic or optimal transport projections, (ii) examines how an auditee could potentially circumvent fairness inspections, and (iii) offers recommendations to help auditors detect such data manipulations. These results are validated through experiments on classical tabular datasets in bias detection.
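The abstract's point that Disparate Impact "depends strongly on the distribution of the sample" can be illustrated with a minimal sketch. The function below is a standard textbook definition of the metric, not code from the paper; the toy arrays are hypothetical.

```python
import numpy as np

def disparate_impact(y_pred, group):
    """Ratio of positive-outcome rates between the unprivileged (group == 0)
    and privileged (group == 1) subpopulations. A value below 0.8 is the
    common 'four-fifths rule' threshold for flagging bias."""
    rate_unpriv = y_pred[group == 0].mean()
    rate_priv = y_pred[group == 1].mean()
    return rate_unpriv / rate_priv

# Toy sample: the metric is a pure function of the empirical distribution,
# so reweighting or swapping a handful of samples can move it across the
# 0.8 compliance threshold without visibly changing the dataset.
y_pred = np.array([1, 0, 0, 0, 1, 1, 1, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(disparate_impact(y_pred, group))  # 0.25 / 0.75 ≈ 0.33, well below 0.8
```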
Problem

Research questions and friction points this paper is trying to address.

Manipulating data samples to artificially satisfy fairness criteria
Detecting data manipulations that circumvent fairness inspections
Providing methods to modify distributions under fairness constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modify distributions using entropic projections
Detect data manipulation in fairness audits
Use optimal transport for fairness constraints
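To give a concrete feel for the entropic-projection idea, here is a hedged sketch of a KL-minimal reweighting (an exponential tilt, i.e. an I-projection onto the parity constraint set) that equalizes the weighted positive rates of two groups. This is an illustrative simplification, not the authors' algorithm: the tilt direction `c` and the bisection solver are assumptions, and the sketch assumes binary labels and groups with both outcomes present in each group.

```python
import numpy as np

def entropic_reweight(y, a, tol=1e-8):
    """KL-minimal exponential tilting of uniform sample weights so that the
    weighted positive rates of groups a == 0 and a == 1 coincide
    (statistical parity). Illustrative sketch: tilt along
    c_i = y_i * (+1 if a_i == 0 else -1) and solve for the tilt parameter
    by bisection on the weighted parity gap."""
    c = np.where(a == 0, y, -y).astype(float)

    def gap(lam):
        w = np.exp(lam * c)
        w /= w.sum()
        r0 = (w * y * (a == 0)).sum() / (w * (a == 0)).sum()
        r1 = (w * y * (a == 1)).sum() / (w * (a == 1)).sum()
        return r0 - r1  # monotone increasing in lam for this tilt

    lo, hi = -50.0, 50.0  # gap(lo) < 0 < gap(hi) when both groups are mixed
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    w = np.exp(0.5 * (lo + hi) * c)
    return w / w.sum()

# Toy example: group 0 has a 25% positive rate, group 1 has 75%.
y = np.array([1, 0, 0, 0, 1, 1, 1, 0])
a = np.array([0, 0, 0, 0, 1, 1, 1, 1])
w = entropic_reweight(y, a)
r0 = (w * y * (a == 0)).sum() / (w * (a == 0)).sum()
r1 = (w * y * (a == 1)).sum() / (w * (a == 1)).sum()
print(round(r0, 4), round(r1, 4))  # -> 0.5 0.5
```

Because the tilt minimizes KL divergence from the uniform weights, the reweighted sample stays as close as possible to the original distribution while exactly meeting the parity constraint, which is precisely why such manipulations are hard for a static-metric audit to spot.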
Valentin Lafargue
Institut de Mathématiques de Toulouse, France
Adriana Laurindo Monteiro
Instituto Nacional de Matemática Pura e Aplicada, Brazil
Emmanuelle Claeys
Institut de Recherche en Informatique de Toulouse, France
Laurent Risser
CNRS - Toulouse Mathematics Institute - ANITI
XAI · surrogate models · bias mitigation in ML · image analysis
Jean-Michel Loubes
INRIA (affiliated to Institut de Mathématiques de Toulouse) & ANITI
Statistics · Machine Learning