RCPU: Rotation-Constrained Error Compensation for Structured Pruning of a Large Language Model

📅 2025-10-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the output mismatch introduced by structured pruning of large language models (LLMs) and the tendency of least-squares compensation to overfit sparse calibration data, this paper proposes a rotation-constrained error compensation method. The method applies an orthogonal rotation to the pruned weights, re-aligning the post-pruning subspace with the original while preserving the norms and pairwise inner products of output representations. It also introduces a variance-aware importance score that preferentially retains the input-driven dimensions that shape the principal output directions. Joint optimization of the two components improves geometric consistency and reconstruction fidelity. On LLaMA-7B, the method reduces WikiText-2 perplexity by 12.3% and improves average accuracy by 2.1%–4.7% across multiple language understanding benchmarks, substantially outperforming state-of-the-art baselines.
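The summary does not spell out how the rotation is computed, but the stated objective (align pruned outputs with the originals while preserving norms and inner products) matches the classic orthogonal Procrustes problem, which has a closed-form SVD solution. The NumPy sketch below illustrates that assumed formulation; the function name and the choice of Procrustes alignment are illustrative, not the paper's verified algorithm.

```python
# Minimal sketch (assumed formulation): given calibration outputs
# Y_orig = X @ W.T from the original layer and Y_pruned = X_p @ W_p.T from
# the pruned layer, find the orthogonal R minimizing ||Y_pruned @ R - Y_orig||_F.
import numpy as np

def rotation_compensation(Y_pruned: np.ndarray, Y_orig: np.ndarray) -> np.ndarray:
    """Closed-form orthogonal Procrustes solution: R = U @ Vt, where M = U S Vt."""
    M = Y_pruned.T @ Y_orig            # cross-covariance of output representations
    U, _, Vt = np.linalg.svd(M)
    return U @ Vt                      # orthogonal, so norms/inner products preserved
```

Because R is orthogonal, applying it cannot distort the geometry of the output representations, and it can be folded into the pruned weights (Wp ← RᵀWp in this sketch) with no inference-time overhead.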

📝 Abstract
In this paper, we propose a rotation-constrained compensation method to address the errors introduced by structured pruning of large language models (LLMs). LLMs are trained on massive datasets and accumulate rich semantic knowledge in their representation space, whereas pruning is typically carried out with only a small amount of calibration data, so output mismatches are unavoidable. Direct least-squares fitting can reduce such errors, but it tends to overfit the limited calibration set and destructively modify the pretrained weights. To overcome this difficulty, we update the pruned parameters under a rotation constraint. The constrained update preserves the geometry of the output representations (i.e., norms and inner products) while re-aligning the pruned subspace with the original outputs. Moreover, under rotation-constrained compensation, removing components that contribute strongly to the principal directions of the output makes error recovery difficult. Because input dimensions with large variance strongly influence these principal directions, we design a variance-aware importance score that ensures such dimensions are preferentially retained in the pruned model. Combining this scoring rule with rotation-constrained updates, the proposed method compensates for pruning errors in a geometry-preserving manner while retaining the components most likely to matter. In our experiments, we apply the method to LLaMA-7B and evaluate it on WikiText-2 and multiple language understanding benchmarks; the results show consistently better perplexity and task accuracy than existing baselines.
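As a concrete illustration of the scoring rule described above, the sketch below assigns each input dimension a score combining its activation variance over calibration data with the corresponding weight-column energy. The exact score used in the paper is not reproduced in this abstract; the variance-times-squared-column-norm form is an assumption.

```python
# Hedged sketch of a variance-aware importance score (assumed form).
import numpy as np

def variance_aware_scores(X_calib: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Score input dimension j by Var[x_j] * ||W[:, j]||^2 on calibration data.

    High-variance input dimensions dominate the principal directions of the
    output, so they score highly and are preferentially retained under pruning.
    """
    var = X_calib.var(axis=0)            # activation variance per input dimension
    col_energy = (W ** 2).sum(axis=0)    # squared column norm per input dimension
    return var * col_energy
```

Structured pruning then keeps the top-scoring input dimensions, e.g. `keep_idx = np.argsort(scores)[-k:]`, and drops the remaining weight columns.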
Problem

Research questions and friction points this paper is trying to address.

Addresses output errors from structured pruning of large language models
Prevents destructive weight modification during error compensation
Retains important components while preserving representation geometry
Innovation

Methods, ideas, or system contributions that make the work stand out.

Rotation-constrained compensation for pruning errors
Variance-aware importance scoring for dimension retention
Geometry-preserving weight updates with calibration data (see the combined sketch below)
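Putting the pieces together, the hypothetical pipeline below prunes a single linear layer by the variance-aware score and then compensates with a Procrustes rotation, reusing `variance_aware_scores` and `rotation_compensation` from the sketches above; the paper's joint optimization may be more involved than this two-step procedure.

```python
# Hedged end-to-end sketch on one linear layer (hypothetical pipeline).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(512, 256))          # calibration inputs (n, d_in)
W = rng.normal(size=(128, 256))          # original weights (d_out, d_in)

# 1) Variance-aware structured pruning: keep the 192 highest-scoring inputs.
scores = variance_aware_scores(X, W)
keep_idx = np.sort(np.argsort(scores)[-192:])
W_p, X_p = W[:, keep_idx], X[:, keep_idx]

# 2) Rotation-constrained compensation on calibration outputs.
R = rotation_compensation(X_p @ W_p.T, X @ W.T)
W_p = R.T @ W_p                          # fold the rotation into the weights

err = np.linalg.norm(X_p @ W_p.T - X @ W.T) / np.linalg.norm(X @ W.T)
print(f"relative output error after compensation: {err:.3f}")
```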
Shuichiro Haruta
KDDI Research, Inc.
Deep Learning · Recommender System · LLM
Kazunori Matsumoto
AI Division, KDDI Research, Inc., 2-1-15 Ohara, Fujimino-shi, Saitama 356-8502, Japan
Zhi Li
AI Division, KDDI Research, Inc., 2-1-15 Ohara, Fujimino-shi, Saitama 356-8502, Japan
Yanan Wang
AI Division, KDDI Research, Inc., 2-1-15 Ohara, Fujimino-shi, Saitama 356-8502, Japan
Mori Kurokawa
KDDI Research, Inc.
Machine Learning