Robust ML Auditing using Prior Knowledge

📅 2025-05-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Machine learning decision systems can be manipulated during fairness audits: a platform may selectively alter its answers to audit queries to pass scrutiny while keeping its discriminatory behavior toward real users. Method: the paper introduces a manipulation-proof auditing approach in which the regulator exploits prior knowledge about the task the platform solves. It first shows that public priors (e.g. a public dataset) let platforms easily fool the auditor, then formally establishes the conditions under which a prior on the ground truth prevents audit manipulation. Results: experiments on the Adult and COMPAS datasets exemplify the maximum level of unfairness a platform can hide before being detected as malicious, opening new directions for more robust fairness audits.

📝 Abstract
The rapid adoption of ML decision-making systems across products and services has led to a set of regulations on how such systems should behave and be built. Among all the technical challenges to enforcing these regulations, one crucial, yet under-explored problem is the risk of manipulation while these systems are being audited for fairness. This manipulation occurs when a platform deliberately alters its answers to a regulator to pass an audit without modifying its answers to other users. In this paper, we introduce a novel approach to manipulation-proof auditing by taking into account the auditor's prior knowledge of the task solved by the platform. We first demonstrate that regulators must not rely on public priors (e.g. a public dataset), as platforms could easily fool the auditor in such cases. We then formally establish the conditions under which an auditor can prevent audit manipulations using prior knowledge about the ground truth. Finally, our experiments with two standard datasets exemplify the maximum level of unfairness a platform can hide before being detected as malicious. Our formalization and generalization of manipulation-proof auditing with a prior opens up new research directions for more robust fairness audits.
Problem

Research questions and friction points this paper is trying to address.

Addressing manipulation risks in ML fairness audits
Preventing platforms from fooling auditors via the auditor's prior knowledge
Establishing conditions under which fairness audits are manipulation-proof
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses the auditor's prior knowledge to make audits manipulation-proof
Shows that public priors (e.g. public datasets) can be exploited by platforms to fool audits
Formally establishes the conditions under which audit manipulation is prevented
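The threat model behind these contributions can be illustrated with a toy simulation (all models, distributions, and thresholds below are illustrative assumptions, not the paper's actual construction): a platform serves a discriminatory model to real users but answers audit queries with a fair-looking one, so a parity-only audit is fooled, while an auditor holding a ground-truth prior can expose the inconsistency.

```python
import random

random.seed(42)

# Toy world: sensitive attribute a in {0, 1}, feature x in {0, 1},
# ground-truth label y = x (the label is independent of a).
population = [(random.randint(0, 1), random.randint(0, 1)) for _ in range(10_000)]

def honest_model(a, x):
    # Fair and accurate: predicts y = x, ignoring the sensitive attribute.
    return x

def discriminatory_model(a, x):
    # Unfair model served to real users: never grants group a = 0.
    return x if a == 1 else 0

def manipulated_audit_answer(a, x):
    # Manipulation: the platform answers audit queries with the
    # fair-looking rule to pass a demographic-parity check.
    return honest_model(a, x)

def parity_gap(model, data):
    # Demographic parity: difference in positive-decision rates per group.
    rates = []
    for g in (0, 1):
        group = [(a, x) for a, x in data if a == g]
        rates.append(sum(model(a, x) for a, x in group) / len(group))
    return abs(rates[0] - rates[1])

def accuracy_vs_prior(model, data):
    # The auditor's prior here is full knowledge of the ground truth y = x.
    return sum(model(a, x) == x for a, x in data) / len(data)

audit_sample = population[:2_000]

# A parity-only audit is fooled: manipulated answers look fair (gap near 0),
# even though real users face a large gap from the discriminatory model.
fooled_gap = parity_gap(manipulated_audit_answer, audit_sample)
real_gap = parity_gap(discriminatory_model, audit_sample)

# With a ground-truth prior, the auditor can cross-check observed
# user-facing decisions against what the audit answers imply: the
# discriminatory model's accuracy is inconsistent with the (perfectly
# accurate) answers the platform gave under audit.
print(fooled_gap, real_gap)
print(accuracy_vs_prior(discriminatory_model, audit_sample))
print(accuracy_vs_prior(manipulated_audit_answer, audit_sample))
```

The gap between the two accuracy estimates is what a prior-equipped auditor can exploit: a platform cannot simultaneously match the ground-truth prior on audit queries and behave arbitrarily unfairly toward users without the discrepancy becoming detectable.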
Jade Garcia Bourrée
Inria, Rennes, France
Augustin Godinot
Inria, Rennes, France
M. Vos
Milos Vujasinovic
EPFL
Machine Learning · Decentralized Learning · Privacy
Sayan Biswas
EPFL, Lausanne, Switzerland
G. Trédan
LAAS, CNRS, Toulouse, France
E. Le Merrer
Inria, Rennes, France
Anne-Marie Kermarrec
Professor, EPFL
Distributed systems · Social networks · Gossip protocols