From Signal Degradation to Computation Collapse: Uncovering the Two Failure Modes of LLM Quantization

📅 2026-04-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the severe performance degradation commonly observed when large language models are post-training quantized to ultra-low bit-widths (e.g., 2-bit), a phenomenon whose underlying mechanisms remain poorly understood. The work identifies two qualitatively distinct failure modes, Signal Degradation and Computation Collapse, and introduces a diagnostic framework for distinguishing between them. Building on this mechanistic analysis, the authors apply targeted, training-free, mechanism-aware interventions. Experiments show that these interventions can mitigate Signal Degradation but remain ineffective for Computation Collapse, which the authors suggest requires structural reconstruction rather than mere compensation. The research offers both mechanistic insight and practical direction for ultra-low-bit quantization of large language models.

📝 Abstract
Post-Training Quantization (PTQ) is critical for the efficient deployment of Large Language Models (LLMs). While 4-bit quantization is widely regarded as an optimal trade-off, reducing the precision to 2-bit usually triggers a catastrophic "performance cliff." It remains unclear whether the underlying mechanisms differ fundamentally. Consequently, we conduct a systematic mechanistic analysis, revealing two qualitatively distinct failure modes: Signal Degradation, where the computational patterns remain intact but information precision is impaired by cumulative error; and Computation Collapse, where key components fail to function, preventing correct information processing and destroying the signal in the early layers. Guided by this diagnosis, we conduct mechanism-aware interventions, demonstrating that targeted, training-free repair can mitigate Signal Degradation, but remains ineffective for Computation Collapse. Our findings provide a systematic diagnostic framework for PTQ failures and suggest that addressing Computation Collapse requires structural reconstruction rather than mere compensation.
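To make the 4-bit versus 2-bit gap concrete, the following is a minimal sketch of symmetric, per-tensor round-to-nearest quantization applied to synthetic Gaussian weights. It is not the authors' diagnostic framework or repair method; the function names (`quantize_rtn`, `mse`), the per-tensor scaling scheme, and the synthetic weights are all assumptions chosen for illustration only.

```python
import random


def quantize_rtn(weights, bits):
    """Symmetric per-tensor round-to-nearest quantization (illustrative assumption,
    not the scheme used in the paper)."""
    qmax = 2 ** (bits - 1) - 1              # 7 for 4-bit, 1 for 2-bit
    scale = max(abs(w) for w in weights) / qmax
    dequantized = []
    for w in weights:
        q = round(w / scale)                # nearest integer level
        q = max(-qmax, min(qmax, q))        # clamp to the representable range
        dequantized.append(q * scale)       # map back to floating point
    return dequantized


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
# Synthetic stand-in for a weight tensor; real LLM weights are not Gaussian i.i.d.
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]

err4 = mse(weights, quantize_rtn(weights, 4))
err2 = mse(weights, quantize_rtn(weights, 2))
levels2 = len(set(quantize_rtn(weights, 2)))

print(f"4-bit MSE: {err4:.4f}")
print(f"2-bit MSE: {err2:.4f}")
print(f"distinct 2-bit levels: {levels2}")
```

Under this symmetric scheme, 2-bit leaves at most three representable values per tensor, so the per-weight error is far larger than at 4-bit. This only illustrates the loss of precision in a single tensor; it does not by itself separate Signal Degradation from Computation Collapse, which the abstract describes in terms of how components behave across layers.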
Problem

Research questions and friction points this paper is trying to address.

Post-Training Quantization
LLM Quantization
Signal Degradation
Computation Collapse
Performance Cliff
Innovation

Methods, ideas, or system contributions that make the work stand out.

Post-Training Quantization
Signal Degradation
Computation Collapse
Large Language Models
Quantization Failure Modes
🔎 Similar Papers
Chenxi Zhou
School of Advanced Interdisciplinary Sciences, University of Chinese Academy of Sciences; The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences
Pengfei Cao
Institute of Automation, Chinese Academy of Sciences
Natural Language Processing, Large Language Models, Information Extraction
Jiang Li
College of Computer Science, Inner Mongolia University
Bohan Yu
School of Advanced Interdisciplinary Sciences, University of Chinese Academy of Sciences; The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences
Jinyu Ye
The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences
Jun Zhao
School of Marine Sciences, Sun Yat-sen University
ocean optics, remote sensing, numerical modeling
Kang Liu
The Key Laboratory of Cognition and Decision Intelligence for Complex Systems, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences