On the Burden of Achieving Fairness in Conformal Prediction

📅 2026-05-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the problem that calibrating conformal prediction with a single pooled (uniform) threshold ignores heterogeneity in score distributions across groups, which biases group-wise coverage. By analyzing the population score distributions underlying split conformal calibration, the study identifies a fundamental trade-off between pooled and group-wise calibration. It establishes a conservation law and a lower bound for coverage bias, quantifying how cross-group quantile heterogeneity affects fairness. Theoretical analysis and experiments on synthetic and real-world data both show that the choice of calibration strategy cannot eliminate this heterogeneity; it only determines whether the resulting bias appears in coverage or in prediction set size. This exposes an inherent tension between coverage fairness and set-size fairness and provides a principled analytical framework for fair conformal prediction.
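As an informal illustration of why a conservation-type identity can arise under pooled calibration, consider the following sketch. It is not taken from the paper and may differ from its formal statement. The symbols are assumptions introduced here: $\pi_g$ is the population share of group $g$, $F_g$ is the (continuous) CDF of that group's conformity scores, and $q$ is the pooled population threshold at miscoverage level $\alpha$.

```latex
% Pooled score distribution and pooled threshold (assumed notation):
F(s) = \sum_g \pi_g F_g(s), \qquad F(q) = 1 - \alpha .

% Group-wise coverage under the pooled threshold is F_g(q), so
\sum_g \pi_g \bigl( F_g(q) - (1 - \alpha) \bigr) = F(q) - (1 - \alpha) = 0 .
```

Under these assumptions, the weighted coverage deviations sum to zero: any group that is over-covered by the pooled threshold must be offset by another that is under-covered, and every deviation is zero only when each $F_g(q)$ equals $1-\alpha$, which for strictly increasing CDFs means all groups share the same $(1-\alpha)$-quantile.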

📝 Abstract

Conformal prediction is often calibrated with a single pooled threshold, but this can hide cross-group heterogeneity in score distributions and distort group-wise coverage. We study this phenomenon through the population score distributions underlying split conformal calibration. First, we derive a conservation law and lower bound showing that pooled calibration incurs irreducible group-wise coverage distortion at a scale set by cross-group quantile heterogeneity. Second, we demonstrate that the two leading fairness definitions for conformal prediction, Equalized Coverage and Equalized Set Size, are fundamentally in tension. Third, we quantify the cost of moving between policies which treat groups separately or pool them. Experiments on synthetic and real data confirm the same bidirectional trade-off after finite-sample calibration. Our results show that, for the policy families studied here, calibration choice does not remove cross-group heterogeneity; it determines whether the resulting distortion appears in the coverage or size dimension, providing a principled lens for analyzing fairness-oriented calibration choices in practice.
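The trade-off described above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's experimental setup: the two groups, their half-normal score distributions, the sample sizes, and the helper names `conformal_quantile` and `simulate` are all hypothetical choices made for illustration. The threshold is used as a stand-in for set size, which is exact for interval predictors built from absolute-residual scores (interval width is twice the threshold).

```python
import numpy as np


def conformal_quantile(scores, alpha):
    """Split conformal threshold: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(scores)
    k = min(int(np.ceil((n + 1) * (1 - alpha))), n)  # clip when alpha is very small
    return float(np.sort(scores)[k - 1])


def simulate(alpha=0.1, n_cal=5000, n_test=20000, seed=0):
    """Compare pooled and group-wise calibration on two synthetic groups."""
    rng = np.random.default_rng(seed)
    # Hypothetical groups: scores are |N(0, sigma_g)|, so group B has larger scores.
    sigmas = {"A": 1.0, "B": 2.0}
    cal = {g: np.abs(rng.normal(0.0, s, n_cal)) for g, s in sigmas.items()}
    test = {g: np.abs(rng.normal(0.0, s, n_test)) for g, s in sigmas.items()}

    # Pooled policy: one threshold from all calibration scores.
    q_pooled = conformal_quantile(np.concatenate(list(cal.values())), alpha)
    # Group-wise policy: one threshold per group.
    q_group = {g: conformal_quantile(cal[g], alpha) for g in sigmas}

    results = {}
    for g in sigmas:
        results[g] = {
            "pooled_threshold": q_pooled,
            "pooled_coverage": float(np.mean(test[g] <= q_pooled)),
            "group_threshold": q_group[g],
            "group_coverage": float(np.mean(test[g] <= q_group[g])),
        }
    return results


if __name__ == "__main__":
    for group, stats in simulate().items():
        print(group, {name: round(value, 3) for name, value in stats.items()})
```

In this toy setting, the pooled policy gives both groups the same threshold but over-covers the low-score group and under-covers the high-score group, while the group-wise policy brings both coverages close to the target at the price of unequal thresholds. The distortion moves between the coverage and size dimensions rather than disappearing.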

Problem

Research questions and friction points this paper is trying to address.

Conformal Prediction

Fairness

Coverage Distortion

Group Heterogeneity

Calibration

Innovation

Methods, ideas, or system contributions that make the work stand out.

Conformal Prediction

Fairness

Equalized Coverage