Importance-Guided Basis Selection for Low-Rank Decomposition of Large Language Models

📅 2026-05-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing low-rank compression methods often ignore the local geometry of the task loss and therefore fail to identify which singular-vector bases matter most. To address this, we propose Basis Selection with Importance (BSI), a framework that applies a second-order Taylor expansion of the task loss to low-rank basis pruning. BSI estimates how much the task loss increases when each basis is removed by combining first-order gradient sensitivity with second-order Hessian curvature. To scale this to large models, we develop an efficient Hessian-diagonal estimator based on Hutchinson’s randomized probing with symmetric parameter perturbations, and we support it with theoretical guarantees, including loss-increase bounds and sample-complexity results. Experiments on mathematical reasoning benchmarks show that BSI consistently outperforms existing low-rank decomposition baselines, with the largest gains under aggressive compression.
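
For orientation, the generic second-order Taylor estimate that a score of this kind builds on can be written as below. The notation is assumed here and is not taken from the paper: $\sigma_i$ is the $i$-th singular value, $g_i$ and $H_{ii}$ are the first and second derivatives of the task loss $\mathcal{L}$ with respect to it, and removing a basis is modeled as setting $\sigma_i$ to zero. The paper's exact score may differ, for example in how cross terms or signs are handled.

```latex
% Removing basis i corresponds to the perturbation \delta_i = -\sigma_i.
\Delta \mathcal{L}_i
  \;\approx\; g_i \,\delta_i + \tfrac{1}{2} H_{ii}\, \delta_i^{2}
  \;=\; -\,g_i \,\sigma_i + \tfrac{1}{2} H_{ii}\, \sigma_i^{2},
\qquad
g_i = \frac{\partial \mathcal{L}}{\partial \sigma_i},
\quad
H_{ii} = \frac{\partial^{2} \mathcal{L}}{\partial \sigma_i^{2}}.
```

The first term captures gradient sensitivity and the second captures curvature, which is why an estimate of the Hessian diagonal is needed.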

📝 Abstract

Low-rank decomposition is a compelling approach for compressing large language models, but its effectiveness hinges on selecting which singular-vector bases to retain for a target task. Existing methods such as Basel adapt singular-value coefficients on downstream data and prune bases with small re-learned magnitudes, a heuristic that can be misaligned with task performance because it ignores the local geometry of the loss landscape. We present Basis Selection with Importance (BSI), a principled low-rank compression framework that ranks and prunes bases by directly estimating the expected loss increase incurred when each basis is removed. BSI derives a derivative-based importance score from a second-order Taylor expansion of the task loss with respect to singular values, combining first-order sensitivity and second-order curvature to quantify pruning impact. To make this criterion practical for LLMs, we develop an efficient Hessian-diagonal estimator by adapting the Hutchinson randomized-probing method to loss curvature with symmetric parameter perturbations. We provide a comprehensive theoretical analysis, including loss-increase bounds under basis pruning, explicit propagation of Hessian-diagonal estimation error into these bounds, variance characterization tied to the Hessian spectrum, high-probability sample-complexity guarantees for achieving a target estimation accuracy, and guidance on perturbation intensity. Extensive experiments on mathematical reasoning benchmarks demonstrate that BSI consistently outperforms state-of-the-art low-rank decomposition baselines, with especially strong improvements under deep compression.
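
The pipeline described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the gradient-based central difference used to form Hessian-vector products, and the rule of keeping the bases with the largest predicted loss increase are all assumptions made for illustration. The sketch treats the singular values as a flat vector `sigma` and assumes a hypothetical `grad_fn` that returns the gradient of the task loss with respect to them.

```python
import numpy as np


def hutchinson_diag_hessian(grad_fn, sigma, num_probes=64, eps=1e-3, rng=None):
    """Estimate diag(H) as the average of z * (H z) over Rademacher probes z.

    H z is approximated with a symmetric (central) difference of gradients:
    H z ~= (grad(sigma + eps*z) - grad(sigma - eps*z)) / (2*eps).
    This finite-difference form is an assumption of this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    diag_est = np.zeros_like(sigma, dtype=float)
    for _ in range(num_probes):
        # Rademacher probe: each entry is -1 or +1 with equal probability.
        z = rng.choice([-1.0, 1.0], size=sigma.shape)
        hz = (grad_fn(sigma + eps * z) - grad_fn(sigma - eps * z)) / (2.0 * eps)
        diag_est += z * hz
    return diag_est / num_probes


def basis_importance(grad, hess_diag, sigma):
    """Second-order Taylor estimate of the loss change when sigma_i is set to 0."""
    return -grad * sigma + 0.5 * hess_diag * sigma**2


def select_bases(scores, keep):
    """Return sorted indices of the `keep` bases with the largest predicted loss increase."""
    top = np.argsort(scores)[::-1][:keep]
    return np.sort(top)
```

In practice `grad_fn` would be backed by automatic differentiation over a batch of task data, and `num_probes` and `eps` would be chosen with the variance and perturbation-intensity considerations that the abstract mentions in mind.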

Problem

Research questions and friction points this paper is trying to address.

low-rank decomposition

basis selection

large language models

model compression

singular-vector bases

Innovation

Methods, ideas, or system contributions that make the work stand out.

low-rank decomposition

importance estimation

Hessian-diagonal approximation