Quantization Meets Reasoning: Exploring LLM Low-Bit Quantization Degradation for Mathematical Reasoning

📅 2025-01-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing evaluations of low-bit quantization (e.g., INT4/INT2) on large language models (LLMs) lack fine-grained analysis of its impact on mathematical reasoning—particularly the distinction between numerical computation and reasoning planning capabilities. Method: We propose the first multi-dimensional evaluation framework specifically for quantization effects on mathematical reasoning, decoupling these two core capabilities. Our approach integrates layer-wise sensitivity analysis, step-level reasoning trajectory comparison, and quantitative tracking across capability dimensions. Contribution/Results: Experiments on benchmarks such as MATH reveal that reasoning planning degrades significantly (up to −38%), whereas numerical computation remains comparatively robust. Critical vulnerability points are identified in attention intermediate layers and MLP output representations. The framework provides an interpretable, capability-aware diagnostic tool for quantization-robustness optimization, enabling targeted mitigation strategies for mathematically demanding tasks.
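The layer-wise sensitivity analysis mentioned above can be illustrated with a minimal sketch (not the paper's code): quantize one layer at a time with symmetric uniform quantization and measure the relative change in the output of a toy feed-forward stack. The network, the `tanh` nonlinearity, and the error metric are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def quantize_int_n(w, bits):
    # Symmetric uniform quantization of a weight tensor to `bits` bits
    # (a common baseline scheme; the paper may use a different method).
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

def layer_sensitivity(layers, x, bits=4):
    # For each layer, quantize only that layer's weights and record the
    # relative output error of the full forward pass versus full precision.
    def forward(ws):
        h = x
        for w in ws:
            h = np.tanh(h @ w)  # toy nonlinearity standing in for a real block
        return h

    ref = forward(layers)
    errors = []
    for i in range(len(layers)):
        perturbed = list(layers)
        perturbed[i] = quantize_int_n(layers[i], bits)
        out = forward(perturbed)
        errors.append(np.linalg.norm(out - ref) / np.linalg.norm(ref))
    return errors

rng = np.random.default_rng(0)
layers = [rng.normal(size=(16, 16)) for _ in range(4)]
x = rng.normal(size=(8, 16))
errs = layer_sensitivity(layers, x, bits=4)
```

Ranking layers by `errs` identifies the most quantization-sensitive layers, which is the spirit of localizing critical vulnerability points to attention intermediate layers and MLP outputs.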

📝 Abstract
Large language models have achieved significant advancements in complex mathematical reasoning benchmarks, such as MATH. However, their substantial computational requirements present challenges for practical deployment. Model quantization has emerged as an effective strategy to reduce memory usage and computational costs by employing lower precision and bit-width representations. In this study, we systematically evaluate the impact of quantization on mathematical reasoning tasks. We introduce a multidimensional evaluation framework that qualitatively assesses specific capability dimensions and conduct quantitative analyses on the step-by-step outputs of various quantization methods. Our results demonstrate that quantization differentially affects numerical computation and reasoning planning abilities, identifying key areas where quantized models experience performance degradation.
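The step-by-step output analysis the abstract describes can be sketched as a step-level trajectory comparison between a full-precision trace and a quantized model's trace. This is a simplified illustration: exact string matching per step and the two example metrics (step-match rate, first divergence index) are assumptions, not the paper's actual measures.

```python
def compare_trajectories(ref_steps, quant_steps):
    # Step-level comparison of two reasoning traces: the fraction of
    # position-aligned steps that match, and the index of the first step
    # where the traces diverge (None if one is a prefix of equal length).
    n = max(len(ref_steps), len(quant_steps))
    matches = sum(a == b for a, b in zip(ref_steps, quant_steps))
    first_div = None
    for i, (a, b) in enumerate(zip(ref_steps, quant_steps)):
        if a != b:
            first_div = i
            break
    if first_div is None and len(ref_steps) != len(quant_steps):
        # One trace ends early: divergence starts where the shorter one stops.
        first_div = min(len(ref_steps), len(quant_steps))
    return {"match_rate": matches / n if n else 1.0,
            "first_divergence": first_div}

ref = ["let x = 3", "x + 2 = 5", "answer: 5"]
quant = ["let x = 3", "x + 2 = 6", "answer: 6"]
report = compare_trajectories(ref, quant)
```

An early `first_divergence` with intact arithmetic in later steps would point at planning degradation rather than numerical computation, matching the capability split the paper draws.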
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Model Quantization
Mathematical Reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-Angle Evaluation System
Language Model Quantization
Computational Efficiency Optimization