Ghosted Layers: Unconstrained Activation Alignment for Recovering Layer-Pruned LLMs

📅 2026-05-15
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses the performance degradation of large language models after layer pruning, which arises because the hidden states fed into the surviving layers no longer match the distribution those layers were trained on. To mitigate this without additional training, the authors propose Ghosted Layers, a calibration-based recovery module that uses a small calibration set to compute, in closed form, an optimal linear operator that reconstructs the activation discrepancy introduced by pruning. The authors show that this solution is the unconstrained optimum of the alignment objective, whereas existing methods are confined to restricted operator subspaces. Experiments show that Ghosted Layers consistently outperforms training-free baselines across multiple models and pruning strategies, improving accuracy and perplexity while preserving the efficiency gains of pruning.
📝 Abstract
Layer pruning removes entire Transformer decoder blocks from large language models, but introduces a mismatch between the hidden state received by the next surviving layer and the distribution it was trained to process, leading to significant performance degradation. We propose Ghosted Layers, a training-free recovery module that addresses this issue by solving a boundary activation alignment problem. Our method derives a closed-form optimal linear operator from a small calibration set to reconstruct the activation discrepancy introduced by the pruned layers. We show that this solution corresponds to the unconstrained optimum of the alignment objective, whereas existing methods are restricted to constrained solutions over limited operator subspaces. Experiments across multiple LLM backbones and pruning strategies demonstrate that our method consistently improves accuracy and perplexity over prior training-free baselines, while preserving the efficiency gains of layer pruning.
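The abstract does not spell out the alignment objective, so the following is only a minimal sketch of what a closed-form linear fit at a pruning boundary could look like, not the authors' implementation. It assumes the objective is a least-squares fit of the residual that the pruned block would have added to the hidden state, uses synthetic data in place of a real calibration set, and adds a small ridge term for numerical stability. The function names `fit_linear_residual` and `fit_scalar_residual` are hypothetical. The scalar fit stands in for a method restricted to a limited operator subspace, to illustrate why an unconstrained solution cannot do worse on the calibration objective.

```python
import numpy as np


def fit_linear_residual(x_in, x_out, ridge=1e-6):
    """Least-squares W such that x_in @ W approximates x_out - x_in.

    x_in:  (n_tokens, d_model) hidden states entering the pruned block.
    x_out: (n_tokens, d_model) hidden states the pruned block would have produced.
    ridge: small regularizer (an assumption of this sketch) for stable solves.
    """
    delta = x_out - x_in  # discrepancy introduced by skipping the block
    gram = x_in.T @ x_in + ridge * np.eye(x_in.shape[1])
    return np.linalg.solve(gram, x_in.T @ delta)


def fit_scalar_residual(x_in, x_out):
    """Constrained comparison: the operator is restricted to alpha * I."""
    delta = x_out - x_in
    return float(np.sum(x_in * delta) / np.sum(x_in * x_in))


def mse(pred, target):
    return float(np.mean((pred - target) ** 2))


# Synthetic stand-in for calibration activations (hypothetical data).
rng = np.random.default_rng(0)
n_tokens, d_model = 512, 16
x_in = rng.normal(size=(n_tokens, d_model))
true_map = 0.1 * rng.normal(size=(d_model, d_model))
noise = 0.01 * rng.normal(size=(n_tokens, d_model))
x_out = x_in + x_in @ true_map + noise

w_full = fit_linear_residual(x_in, x_out)
alpha = fit_scalar_residual(x_in, x_out)

err_skip = mse(x_in, x_out)                    # plain pruning, no recovery
err_scalar = mse(x_in + alpha * x_in, x_out)   # constrained operator
err_full = mse(x_in + x_in @ w_full, x_out)    # unconstrained operator

print(f"skip={err_skip:.5f} scalar={err_scalar:.5f} full={err_full:.5f}")
```

On this toy data the unconstrained fit recovers the residual down to the noise level, while the scalar fit improves only slightly over skipping the block. Since the recovery operator here is a single `d_model` by `d_model` matrix applied at the boundary, its cost is small relative to a full decoder block, which is consistent with the abstract's claim that the efficiency gains of pruning are preserved.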
Problem

Research questions and friction points this paper is trying to address.

layer pruning
activation alignment
large language models
performance degradation
hidden state mismatch
Innovation

Methods, ideas, or system contributions that make the work stand out.

layer pruning
activation alignment
training-free recovery
closed-form solution
large language models