Controllable Pareto Trade-off between Fairness and Accuracy

📅 2025-09-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of jointly optimizing fairness and accuracy in NLP, where the two objectives often conflict. The authors propose Controllable Pareto Trade-off (CPT), a multi-objective optimization framework that lets users steer the fairness–accuracy trade-off through a preference expressed as a reference vector. CPT stabilizes the fairness update by using a moving average of stochastic gradients to determine the update direction, and it prunes gradients so that only those of the critical parameters are kept, which improves how closely training follows the user-specified preference. Evaluated on hate speech detection and occupation classification, CPT achieves a higher-quality set of solutions on the Pareto front than the baseline methods, shows better controllability, and follows the user-defined reference vectors precisely.
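To make the notion of a Pareto front concrete, the following is a minimal sketch of non-dominated filtering over two objectives. It is not taken from the paper: the `(error, fairness_gap)` pairs are hypothetical, and both objectives are assumed to be minimized.

```python
from typing import List, Tuple

# Each point is (accuracy_error, fairness_gap); both are assumed to be minimized.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical (error, fairness gap) pairs for five trained models.
candidates = [(0.10, 0.30), (0.12, 0.20), (0.20, 0.10), (0.15, 0.25), (0.25, 0.12)]
front = pareto_front(candidates)
```

Here `(0.15, 0.25)` and `(0.25, 0.12)` are dropped because another candidate is at least as good on both objectives. The remaining three points represent different trade-offs, none of which is strictly better than the others.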

📝 Abstract

The fairness-accuracy trade-off is a key challenge in NLP tasks. Current work focuses on finding a single "optimal" solution to balance the two objectives, which is limiting given the diverse solutions on the Pareto front. This work aims to provide controllable trade-offs according to the user's preference over the two objectives, which is defined as a reference vector. To achieve this goal, we apply multi-objective optimization (MOO), which can find solutions from various regions of the Pareto front. However, it is challenging to precisely control the trade-off due to the stochasticity of the training process and the high-dimensional gradient vectors. Thus, we propose Controllable Pareto Trade-off (CPT), which can effectively train models to perform different trade-offs according to users' preferences. CPT 1) stabilizes the fairness update with a moving average of stochastic gradients to determine the update direction, and 2) prunes the gradients by keeping only the gradients of the critical parameters. We evaluate CPT on hate speech detection and occupation classification tasks. Experiments show that CPT can achieve a higher-quality set of solutions on the Pareto front than the baseline methods. It also exhibits better controllability and can precisely follow the human-defined reference vectors.
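The stabilization step in 1) can be illustrated with a small sketch. The abstract only says "moving average of stochastic gradients"; the exponential form, the class name `GradientEMA`, and the smoothing factor `beta` below are assumptions made for illustration, not the paper's implementation.

```python
from typing import List, Optional


class GradientEMA:
    """Exponential moving average of gradient vectors (illustrative assumption)."""

    def __init__(self, beta: float = 0.9) -> None:
        self.beta = beta  # hypothetical smoothing factor, not from the paper
        self.avg: Optional[List[float]] = None

    def update(self, grad: List[float]) -> List[float]:
        """Fold a new stochastic gradient into the running average and return it."""
        if self.avg is None:
            # The first mini-batch initializes the average.
            self.avg = list(grad)
        else:
            self.avg = [
                self.beta * a + (1.0 - self.beta) * g
                for a, g in zip(self.avg, grad)
            ]
        return self.avg


ema = GradientEMA(beta=0.5)
first = ema.update([1.0, -1.0])   # initializes the average
second = ema.update([3.0, 1.0])   # averages the new gradient with the old one
```

The averaged vector changes more slowly than any single mini-batch gradient, which is one plausible way to obtain a steadier update direction for the fairness objective.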

Problem

Research questions and friction points this paper is trying to address.

Controllable trade-off between fairness and accuracy

Overcoming stochasticity in multi-objective optimization training

Precise alignment with user-defined preference vectors
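One simple way to quantify alignment with a preference vector is the angle between a solution's objective vector and the reference vector. This is a hedged sketch: the function name `angle_to_reference` and the choice of angle as the measure are assumptions, and the paper may use a different metric.

```python
import math
from typing import Sequence


def angle_to_reference(objectives: Sequence[float], reference: Sequence[float]) -> float:
    """Angle in degrees between an objective vector and a reference vector."""
    dot = sum(o * r for o, r in zip(objectives, reference))
    norm_o = math.sqrt(sum(o * o for o in objectives))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_o == 0.0 or norm_r == 0.0:
        raise ValueError("vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cos = max(-1.0, min(1.0, dot / (norm_o * norm_r)))
    return math.degrees(math.acos(cos))


# Hypothetical (accuracy loss, fairness loss) of a trained model and a
# user preference that weights both objectives equally.
deviation = angle_to_reference([0.4, 0.4], [1.0, 1.0])
```

A smaller angle means the solution sits closer to the ray defined by the user's preference; an angle near zero indicates the trade-off matches the requested ratio.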

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-objective optimization for Pareto front solutions

Stabilizes fairness update with gradient moving average

Prunes gradients by keeping critical parameters only
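The pruning idea can be sketched as keeping only a fraction of the gradient entries and zeroing the rest. The summary does not say how critical parameters are identified, so the magnitude-based criterion, the function name `prune_gradient`, and the `keep_ratio` parameter below are assumptions for illustration only.

```python
from typing import List


def prune_gradient(grad: List[float], keep_ratio: float) -> List[float]:
    """Zero all but the largest-magnitude entries of a gradient vector.

    Assumption: "critical" is approximated here by absolute gradient value.
    """
    k = max(1, int(len(grad) * keep_ratio))
    # Indices of the k entries with the largest absolute value.
    ranked = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(ranked[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


pruned = prune_gradient([0.1, -2.0, 0.5, 3.0], keep_ratio=0.5)
```

With `keep_ratio=0.5`, only the two largest-magnitude entries survive, so the update touches fewer parameters and the remaining direction is lower-dimensional.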

🔎 Similar Papers

Long-Term Fairness Inquiries and Pursuits in Machine Learning: A Survey of Notions, Methods, and Challenges