Rethinking Layer Redundancy in Large Language Models: Calibration Objectives and Search for Depth Pruning

📅 2026-04-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing depth pruning methods for large language models (LLMs) typically treat layer redundancy as an intrinsic property of the pretrained model, overlooking how the evaluation objective shapes the redundancy assessment. This work takes a functional perspective, arguing that redundancy is jointly determined by the model and the calibration objective. The authors run an empirical study covering three LLM families, two calibration objectives, and seven search algorithms. They find that different calibration objectives yield qualitatively different sets of redundant layers, and that rankings based on perplexity and on downstream task accuracy do not consistently align, whereas under a fixed objective the search algorithms tend to produce similar solutions. The results suggest that the choice of calibration objective may matter more for pruning outcomes than the choice of search algorithm.
📝 Abstract
Depth pruning improves the inference efficiency of large language models by removing Transformer blocks. Prior work has focused on importance criteria and search algorithms, often treating layer redundancy as an inherent structural property of pretrained networks. In contrast, we adopt a functional perspective, where redundancy is jointly influenced by the model and the evaluation objective, suggesting that a universal ranking may not be sufficient. Through an empirical study across three LLM families, two calibration objectives, and seven search algorithms, we observe that different objectives yield qualitatively different redundant layers, and that perplexity and downstream accuracy rankings do not consistently align. Under a fixed objective, however, search algorithms tend to produce similar solutions. Overall, our results suggest that the calibration objective may play a more influential role than the choice of search algorithm, indicating that further attention to objective design could be beneficial.
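To make the functional perspective concrete, here is a minimal toy sketch, not the authors' method or code. It assumes a hypothetical four-block residual stack acting on a two-coordinate vector, two made-up calibration objectives (`objective_a` and `objective_b`, loose stand-ins for perplexity and downstream task error), and two simple search procedures (`one_shot` and `greedy`). All names and numbers are illustrative, and the toy is deliberately constructed so that the two objectives disagree.

```python
from typing import Callable, List, Sequence, Set

Vector = List[float]
Block = Callable[[Vector], Vector]
Objective = Callable[[Vector], float]


def make_block(index: int, delta: float) -> Block:
    """Hypothetical residual block: adds `delta` to a single coordinate."""
    def block(x: Vector) -> Vector:
        out = list(x)  # copy so the input vector is never mutated
        out[index] += delta
        return out
    return block


def forward(blocks: Sequence[Block], pruned: Set[int], x: Vector) -> Vector:
    """Run the stack; a pruned block is skipped, i.e. it acts as the identity."""
    for i, block in enumerate(blocks):
        if i not in pruned:
            x = block(x)
    return x


def one_shot(blocks: Sequence[Block], objective: Objective, x: Vector, k: int) -> Set[int]:
    """Score each block by the loss after removing it alone; prune the k cheapest."""
    losses = [objective(forward(blocks, {i}, x)) for i in range(len(blocks))]
    order = sorted(range(len(blocks)), key=lambda i: losses[i])
    return set(order[:k])


def greedy(blocks: Sequence[Block], objective: Objective, x: Vector, k: int) -> Set[int]:
    """Prune k blocks one at a time, re-scoring the remaining blocks after each removal."""
    pruned: Set[int] = set()
    for _ in range(k):
        candidates = [i for i in range(len(blocks)) if i not in pruned]
        best = min(candidates, key=lambda i: objective(forward(blocks, pruned | {i}, x)))
        pruned.add(best)
    return pruned


# Blocks 0 and 2 write to coordinate 0; blocks 1 and 3 write to coordinate 1.
blocks = [make_block(0, 1.0), make_block(1, 1.0), make_block(0, 0.1), make_block(1, 0.2)]
x0 = [0.0, 0.0]
full = forward(blocks, set(), x0)  # output of the unpruned stack


def objective_a(out: Vector) -> float:
    """Toy objective that only looks at coordinate 0."""
    return abs(out[0] - full[0])


def objective_b(out: Vector) -> float:
    """Toy objective that only looks at coordinate 1."""
    return abs(out[1] - full[1])


for name, objective in [("objective_a", objective_a), ("objective_b", objective_b)]:
    print(name, sorted(one_shot(blocks, objective, x0, 2)), sorted(greedy(blocks, objective, x0, 2)))
```

In this toy, both search procedures select blocks 1 and 3 under `objective_a` and blocks 0 and 2 under `objective_b`: the set of "redundant" blocks changes with the objective, while the choice of search procedure makes no difference once the objective is fixed. This only illustrates the shape of the claim on a contrived example; it says nothing about how strongly the effect appears in real LLMs.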
Problem

Research questions and friction points this paper is trying to address.

layer redundancy
depth pruning
calibration objectives
large language models
redundant layers
Innovation

Methods, ideas, or system contributions that make the work stand out.

depth pruning
layer redundancy
calibration objective
large language models
functional perspective
Minkyu Kim
Neural Superintelligence Lab, MODULABS, Republic of Korea
Vincent-Daniel Yun
University of Southern California, United States
Youngrae Kim
University of Southern California
Machine Learning, Computer Vision, Domain Adaptation
Youngjin Heo
Neural Superintelligence Lab, MODULABS, Republic of Korea
Suin Cho
Boston University, United States
Seong-hun Kim
Neural Superintelligence Lab, MODULABS, Republic of Korea
Woosang Lim
Seoul National University, Republic of Korea
Gaeul Kwon
Neural Superintelligence Lab, MODULABS, Republic of Korea