🤖 AI Summary
This work addresses the challenge of jointly optimizing fairness and accuracy in NLP, where the two objectives often conflict. We propose Controllable Pareto Trade-off (CPT), a multi-objective optimization framework that lets users steer the fairness–accuracy trade-off through a preference expressed as a reference vector. CPT stabilizes the fairness update with a moving average of stochastic gradients and prunes the gradients so that only those of the critical parameters are kept, which improves fidelity to the user-specified preference and makes navigation of the Pareto front interpretable and customizable. Evaluated on hate speech detection and occupation classification, CPT yields solution sets with broader coverage and a more uniform distribution along the Pareto front, while closely following the user-defined reference vectors. It outperforms the multi-objective optimization (MOO) baselines in both solution quality and preference alignment.
📝 Abstract
The fairness-accuracy trade-off is a key challenge in NLP tasks. Current work focuses on finding a single "optimal" solution that balances the two objectives, which is limiting given the diversity of solutions on the Pareto front. This work aims to provide controllable trade-offs according to the user's preference over the two objectives, which is defined as a reference vector. To achieve this goal, we apply multi-objective optimization (MOO), which can find solutions from various regions of the Pareto front. However, precisely controlling the trade-off is challenging because of the stochasticity of the training process and the high-dimensional gradient vectors. We therefore propose Controllable Pareto Trade-off (CPT), which can effectively train models to realize different trade-offs according to users' preferences. CPT 1) stabilizes the fairness update by using a moving average of stochastic gradients to determine the update direction, and 2) prunes the gradients by keeping only the gradients of the critical parameters. We evaluate CPT on hate speech detection and occupation classification tasks. Experiments show that CPT achieves a higher-quality set of solutions on the Pareto front than the baseline methods. It also exhibits better controllability and can precisely follow the human-defined reference vectors.
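The two stabilization steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the exponential form of the moving average, the smoothing factor `beta`, and the rule that "critical" parameters are those with the largest gradient magnitude are all assumptions made for illustration, and the function names `ema_update` and `prune_to_top_k` are hypothetical.

```python
from typing import List


def ema_update(avg: List[float], grad: List[float], beta: float = 0.9) -> List[float]:
    """Blend a new stochastic gradient into a running average.

    Assumption: an exponential moving average with smoothing factor `beta`.
    """
    return [beta * a + (1.0 - beta) * g for a, g in zip(avg, grad)]


def prune_to_top_k(grad: List[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries and zero out the rest.

    Assumption: "critical" parameters are the ones with the largest
    gradient magnitude; the abstract does not state the criterion.
    """
    if k <= 0:
        return [0.0] * len(grad)
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]


# Toy usage: three noisy fairness gradients for a 4-parameter model.
noisy_grads = [
    [1.0, -0.2, 0.1, 4.0],
    [3.0, 0.2, -0.1, 2.0],
    [2.0, 0.0, 0.0, 3.0],
]
avg = [0.0, 0.0, 0.0, 0.0]
for g in noisy_grads:
    avg = ema_update(avg, g, beta=0.5)

# Only the two largest-magnitude coordinates survive pruning.
direction = prune_to_top_k(avg, k=2)
print(direction)
```

In this toy run the averaged gradient smooths out the step-to-step fluctuation in the first and last coordinates, and pruning then discards the two small coordinates, so the update direction depends only on the parameters treated as critical.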