Preserving Fairness and Safety in Quantized LLMs Through Critical Weight Protection

📅 2026-01-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the adverse effects of quantization on fairness and safety in multilingual large language models, with particular attention to dynamic quantization and non-English settings. While quantization reduces computational cost, it amplifies bias and degrades safety alignment across languages, with safety deterioration most pronounced outside English. The study systematically compares the impact of static and dynamic quantization on multilingual fairness and safety, finding dynamic methods more stable than static ones, and introduces a retraining-free critical weight protection mechanism. By identifying and preserving the parameters most responsible for fair and safe behavior, the method significantly mitigates fairness and safety degradation without compromising inference efficiency. Experiments across English, French, Dutch, Spanish, Turkish, Korean, and Arabic demonstrate that the approach balances model efficiency with trustworthiness in multilingual settings.

📝 Abstract
Quantization is widely adopted to reduce the computational cost of large language models (LLMs); however, its implications for fairness and safety, particularly in dynamic quantization and multilingual contexts, remain underexplored. In this work, we conduct a systematic study of how static and dynamic quantization methods impact fairness and safety across benchmarks measuring intrinsic and extrinsic bias and safety alignment. For fairness, we evaluate English, French, Dutch, Spanish, and Turkish; for safety, we focus on English, Korean, and Arabic. Our findings reveal that quantization consistently degrades fairness and safety, with dynamic methods demonstrating greater stability than static ones. Moreover, fairness degradation varies across languages, while safety deterioration is especially pronounced in non-English settings. To address these risks, we introduce Critical Weight Protection, a novel technique that identifies and preserves fairness- and safety-critical weights during quantization. This approach effectively mitigates bias and safety deterioration without costly retraining or alignment, maintaining trustworthiness while retaining efficiency.
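The listing does not specify how fairness- and safety-critical weights are identified. As a rough illustration of the general recipe the abstract describes (score weights, exempt the highest-scoring fraction from quantization, quantize the rest), here is a minimal PyTorch sketch. The function name, the protect_frac parameter, and the magnitude-based criticality score are all illustrative assumptions, not the paper's actual criterion.

```python
import torch

def quantize_with_protection(weight: torch.Tensor,
                             protect_frac: float = 0.01,
                             n_bits: int = 8) -> torch.Tensor:
    """Hypothetical sketch of critical-weight-protected quantization.

    Uses weight magnitude as a stand-in criticality score; the paper's
    actual fairness/safety-based criterion is not given in this listing.
    """
    # Mark the top protect_frac of weights (by magnitude) as critical.
    k = max(1, int(protect_frac * weight.numel()))
    threshold = torch.topk(weight.abs().flatten(), k).values.min()
    critical_mask = weight.abs() >= threshold

    # Symmetric uniform round-to-nearest quantization of all weights.
    qmax = 2 ** (n_bits - 1) - 1
    scale = weight.abs().max() / qmax
    quantized = torch.round(weight / scale).clamp(-qmax, qmax) * scale

    # Keep the protected weights at full precision; quantize the rest.
    return torch.where(critical_mask, weight, quantized)

# Example: protect 0.5% of a layer's weights while quantizing to 8 bits.
w = torch.randn(4096, 4096)
w_q = quantize_with_protection(w, protect_frac=0.005, n_bits=8)
```

In a deployed setting the protected weights would presumably be stored as a small full-precision side table next to the quantized tensor, trading a minor memory overhead for preserved behavior, consistent with the abstract's claim of retraining-free mitigation that retains efficiency.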
Problem

Research questions and friction points this paper is trying to address.

quantization
fairness
safety
large language models
multilingual
Innovation

Methods, ideas, or system contributions that make the work stand out.

quantization
fairness
safety
critical weight protection
multilingual LLMs