Fair-GPTQ: Bias-Aware Quantization for Large Language Models

📅 2025-09-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
While quantization of large language models (LLMs) reduces memory and computational overhead, existing methods such as GPTQ can exacerbate societal biases (e.g., gender, racial, and religious stereotypes) in occupational associations and discriminatory outputs, and offer no interpretable analysis of where the bias originates. Method: We propose the first fairness-aware LLM quantization framework, which incorporates group fairness as an explicit optimization objective within GPTQ by introducing a fairness regularization term and a bias-aware 4-bit rounding strategy that jointly minimize input-weight product reconstruction error and bias amplification. Channel- and weight-level attribution identifies fairness-sensitive parameter structures. Results: Our method preserves over 90% of zero-shot accuracy while achieving significantly lower unfairness than standard 4-bit baselines, outperforming FP16 models on fairness metrics and matching iterative null-space projection debiasing on racial-stereotype benchmarks.

📝 Abstract
High memory demands of generative language models have drawn attention to quantization, which reduces computational cost, memory usage, and latency by mapping model weights to lower-precision integers. Approaches such as GPTQ effectively minimize input-weight product errors during quantization; however, recent empirical studies show that they can increase biased outputs and degrade performance on fairness benchmarks, and it remains unclear which specific weights cause this issue. In this work, we draw new links between quantization and model fairness by adding explicit group-fairness constraints to the quantization objective and introduce Fair-GPTQ, the first quantization method explicitly designed to reduce unfairness in large language models. The added constraints guide the learning of the rounding operation toward less-biased text generation for protected groups. Specifically, we focus on stereotype generation involving occupational bias and discriminatory language spanning gender, race, and religion. Fair-GPTQ has minimal impact on performance, preserving at least 90% of baseline accuracy on zero-shot benchmarks, reduces unfairness relative to a half-precision model, and retains the memory and speed benefits of 4-bit quantization. We also compare the performance of Fair-GPTQ with existing debiasing methods and find that it achieves performance on par with the iterative null-space projection debiasing approach on racial-stereotype benchmarks. Overall, the results validate our theoretical solution to the quantization problem with a group-bias term, highlight its applicability for reducing group bias at quantization time in generative models, and demonstrate that our approach can further be used to analyze channel- and weight-level contributions to fairness during quantization.
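The abstract's core idea, folding a group-fairness term into the GPTQ quantization objective so the rounding operation itself is steered away from bias amplification, can be illustrated with a minimal numpy sketch. This is a hypothetical illustration, not the authors' implementation: `X_bias` stands for calibration activations contrasting protected-group inputs (an assumption about how the bias term is formed), and the error-compensation update follows the standard GPTQ/OBQ formula with the fairness term simply absorbed into the Hessian.

```python
import numpy as np

def fair_gptq_sketch(W, X, X_bias, lam=1.0, bits=4, damp=1e-2):
    """Hypothetical sketch of fairness-regularized GPTQ quantization.

    Quantizes W (shape d_in x d_out) onto a symmetric `bits`-bit grid while
    approximately minimizing
        ||X (W - Wq)||^2 + lam * ||X_bias (W - Wq)||^2.
    The group-bias term is folded into the Hessian, so the usual GPTQ
    error-compensation update accounts for it automatically.
    """
    W = W.astype(np.float64).copy()
    d_in, _ = W.shape

    # Hessian of the joint (reconstruction + fairness) objective;
    # damping keeps it well-conditioned and invertible.
    H = X.T @ X + lam * (X_bias.T @ X_bias)
    H += damp * np.trace(H) / d_in * np.eye(d_in)
    Hinv = np.linalg.inv(H)

    # Per-output-channel symmetric scales for the low-bit grid.
    qmax = 2 ** (bits - 1) - 1
    scale = np.maximum(np.abs(W).max(axis=0), 1e-8) / qmax

    Wq = np.zeros_like(W)
    for i in range(d_in):  # quantize one input dimension at a time
        q = np.clip(np.round(W[i] / scale), -qmax - 1, qmax)
        Wq[i] = q * scale
        # OBQ-style update: push the rounding error of row i onto the
        # not-yet-quantized rows, weighted by the inverse Hessian.
        err = (W[i] - Wq[i]) / Hinv[i, i]
        if i + 1 < d_in:
            W[i + 1:] -= np.outer(Hinv[i + 1:, i], err)
    return Wq
```

Setting `lam=0` recovers a plain GPTQ-style pass; increasing `lam` trades a little reconstruction fidelity for lower error along the bias-sensitive directions captured by `X_bias`, which is the trade-off the paper formalizes.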
Problem

Research questions and friction points this paper is trying to address.

Reducing biased outputs in quantized large language models
Addressing fairness degradation during weight quantization process
Identifying specific weights causing unfairness in compressed models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Group-fairness constraints in quantization objective
Guiding rounding operation for less-biased generation
Maintaining performance while reducing bias in 4-bit quantization