A Scalable System to Prove Machine Learning Fairness in Zero-Knowledge

📅 2025-05-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the challenge of enabling model owners to publicly and verifiably demonstrate the fairness of their machine learning models without revealing sensitive model parameters. To this end, we propose a novel fairness quantification paradigm based on aggregated statistics, bypassing direct access to raw data or model weights. We derive theoretical fairness bounds for both logistic regression and deep neural networks that are tighter and more discriminative than prior work. Furthermore, we design FairZK, an efficient zero-knowledge proof system that integrates a secure spectral-norm computation protocol with zk-SNARK circuits optimized for maximum, absolute value, and fixed-point arithmetic. FairZK scales to models with up to 47 million parameters, generating a single proof in just 343 seconds, a 3.1x to 1789x speedup over baseline approaches and approximately four orders of magnitude faster than existing solutions.
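The spectral-norm computation that FairZK proves in zero-knowledge can be illustrated in the clear with standard power iteration on W^T W. This is a minimal sketch of the underlying numerical computation only, not the paper's ZK protocol; the function name and API are illustrative assumptions.

```python
import numpy as np

def spectral_norm(W, iters=100, seed=0):
    """Largest singular value of W via power iteration on W^T W.

    Illustrative sketch of the quantity FairZK's protocol commits to;
    the actual ZK-friendly protocol in the paper differs.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(W.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        u = W @ v          # one step of W
        v = W.T @ u        # one step of W^T
        v /= np.linalg.norm(v)
    return np.linalg.norm(W @ v)

W = np.array([[3.0, 0.0], [0.0, 1.0]])
print(spectral_norm(W))  # close to 3.0, the largest singular value
```

Spectral norms of weight matrices bound how much a layer can amplify differences between inputs, which is why they appear in fairness bounds for deep networks.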

📝 Abstract
With the rise of machine learning techniques, ensuring the fairness of decisions made by machine learning algorithms has become of great importance in critical applications. However, measuring fairness often requires full access to the model parameters, which compromises the confidentiality of the models. In this paper, we propose a solution using zero-knowledge proofs, which allows the model owner to convince the public that a machine learning model is fair while preserving the secrecy of the model. To circumvent the efficiency barrier of naively proving machine learning inferences in zero-knowledge, our key innovation is a new approach to measure fairness only with model parameters and some aggregated information of the input, but not on any specific dataset. To achieve this goal, we derive new bounds for the fairness of logistic regression and deep neural network models that are tighter and better reflect fairness than prior work. Moreover, we develop efficient zero-knowledge proof protocols for common computations involved in measuring fairness, including the spectral norm of matrices, maximum, absolute value, and fixed-point arithmetic. We have fully implemented our system, FairZK, that proves machine learning fairness in zero-knowledge. Experimental results show that FairZK is significantly faster than the naive approach and an existing scheme that uses zero-knowledge inference as a subroutine. The prover time is improved by 3.1x--1789x depending on the size of the model and the dataset. FairZK is the first to scale to a large model with 47 million parameters, and generates a proof of its fairness in 343 seconds. This is estimated to be 4 orders of magnitude faster than existing schemes, which only scale to small models with hundreds to thousands of parameters.
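The abstract mentions fixed-point arithmetic as one of the computations the proof protocols cover; zk-SNARK circuits operate over finite-field integers, so real-valued model weights are typically encoded as scaled integers. A hedged sketch of such an encoding, with an illustrative 16-fractional-bit scale that is our assumption, not FairZK's parameter choice:

```python
SCALE = 1 << 16  # 16 fractional bits; illustrative, not FairZK's choice

def to_fixed(x: float) -> int:
    """Encode a real number as a scaled integer (round to nearest)."""
    return round(x * SCALE)

def fx_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values.

    The raw product carries a factor of SCALE**2, so one rescaling
    (a truncating shift, which a circuit must also prove correct)
    restores the fixed-point representation.
    """
    return (a * b) // SCALE

a, b = to_fixed(1.5), to_fixed(-0.25)
print(fx_mul(a, b) / SCALE)  # -0.375
```

The truncation in `fx_mul` is exactly the kind of non-field operation (like maximum and absolute value) that is cheap natively but requires dedicated circuit gadgets, which is why the paper optimizes for it.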
Problem

Research questions and friction points this paper is trying to address.

Ensuring ML fairness without revealing model parameters
Improving efficiency of zero-knowledge proofs for fairness
Scaling fairness proofs to large ML models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses zero-knowledge proofs for fairness verification
Measures fairness without specific dataset access
Develops efficient protocols for common computations
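The "fairness without specific dataset access" idea can be illustrated for logistic regression: from per-group aggregated statistics alone (e.g. mean feature vectors), one can compute a proxy for the demographic-parity gap without touching raw records. This is a simplified sketch under our own assumptions; the function name is hypothetical and the paper's actual bounds are derived differently and are tighter.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def parity_gap_from_means(w, b, mean_a, mean_b):
    """Illustrative demographic-parity proxy for a logistic model w, b,
    evaluated only on per-group mean feature vectors (aggregated
    statistics), not on any individual record. Not FairZK's bound."""
    score_a = sigmoid(b + sum(wi * xi for wi, xi in zip(w, mean_a)))
    score_b = sigmoid(b + sum(wi * xi for wi, xi in zip(w, mean_b)))
    return abs(score_a - score_b)

# Identical group means -> zero gap; separated means -> positive gap.
print(parity_gap_from_means([2.0], 0.0, [0.5], [0.5]))   # 0.0
print(parity_gap_from_means([2.0], 0.0, [1.0], [-1.0]))  # about 0.76
```

Because only the model parameters and a few aggregates enter the computation, a prover can commit to the weights and prove such a quantity in zero-knowledge far more cheaply than proving per-example inference.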
Tianyu Zhang
University of Illinois, Urbana-Champaign, Shanghai Jiao Tong University
Shen Dong
Michigan State University
LLM Agent, Trustworthy AI
O. Deniz Kose
University of California, Irvine
Yanning Shen
University of California, Irvine
Trustworthy ML/AI, Learning over Graphs, Online Learning
Yupeng Zhang
University of Illinois, Urbana-Champaign