Not All Layers Need Tuning: Selective Layer Restoration Recovers Diversity

📅 2026-02-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the mode collapse commonly observed in large language models after post-training, which severely degrades output diversity in open-ended generation. The authors propose Selective Layer Restoration (SLR), a method built on the finding, reported here for the first time, that mode collapse is localized to specific network layers. By reverting only these layers to their pre-trained weights, SLR balances diversity and generation quality without incurring additional inference cost. To guide the selection of layers to restore, the study introduces a proxy task, Constrained Random Character (CRC) generation, which evaluates the trade-off between diversity and validity. Experiments across mainstream model families, including Llama, Qwen, and Gemma, demonstrate that SLR consistently enhances output diversity in creative writing, open-ended question answering, and multi-step reasoning while preserving high-quality generation.

📝 Abstract
Post-training improves instruction-following and helpfulness of large language models (LLMs) but often reduces generation diversity, which leads to repetitive outputs in open-ended settings, a phenomenon known as mode collapse. Motivated by evidence that LLM layers play distinct functional roles, we hypothesize that mode collapse can be localized to specific layers and that restoring a carefully chosen range of layers to their pre-trained weights can recover diversity while maintaining high output quality. To validate this hypothesis and decide which layers to restore, we design a proxy task -- Constrained Random Character (CRC) -- with an explicit validity set and a natural diversity objective. Results on CRC reveal a clear diversity-validity trade-off across restoration ranges and identify configurations that increase diversity with minimal quality loss. Based on these findings, we propose Selective Layer Restoration (SLR), a training-free method that restores selected layers in a post-trained model to their pre-trained weights, yielding a hybrid model with the same architecture and parameter count, incurring no additional inference cost. Across three different tasks (creative writing, open-ended question answering, and multi-step reasoning) and three different model families (Llama, Qwen, and Gemma), we find SLR can consistently and substantially improve output diversity while maintaining high output quality.
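The core operation the abstract describes — building a hybrid model by reverting a chosen range of layers to their pre-trained weights — can be sketched in a few lines. The sketch below is illustrative only: it represents each model as a flat dict of parameter names to weights (as in a PyTorch `state_dict`), and the `layers.<idx>.` naming scheme and the restored range are assumptions, not the paper's exact configuration or layer selection.

```python
def parse_layer_index(param_name):
    """Extract the transformer block index from a dotted parameter name
    (e.g. 'layers.12.mlp.weight' -> 12), or None for non-layer params."""
    parts = param_name.split(".")
    for i, part in enumerate(parts):
        if part == "layers" and i + 1 < len(parts) and parts[i + 1].isdigit():
            return int(parts[i + 1])
    return None

def selective_layer_restoration(post_trained, pre_trained, restore_layers):
    """Return a hybrid state dict: post-trained weights everywhere except the
    layers in `restore_layers`, which are reverted to pre-trained values.
    The hybrid has the same keys (architecture/parameter count) as the inputs."""
    hybrid = dict(post_trained)  # start from the post-trained model
    for name in post_trained:
        idx = parse_layer_index(name)
        if idx is not None and idx in restore_layers:
            hybrid[name] = pre_trained[name]  # revert this layer only
    return hybrid

# Toy example with scalar "weights": revert layer 1, keep everything else.
pre  = {"layers.0.mlp.weight": 0.1, "layers.1.mlp.weight": 0.2, "embed.weight": 0.0}
post = {"layers.0.mlp.weight": 1.1, "layers.1.mlp.weight": 1.2, "embed.weight": 1.0}
hybrid = selective_layer_restoration(post, pre, restore_layers={1})
```

Because the hybrid simply swaps weight values, it keeps the post-trained model's architecture and parameter count, which is why the method adds no inference cost.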
Problem

Research questions and friction points this paper is trying to address.

mode collapse
generation diversity
post-training
large language models
repetitive outputs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Selective Layer Restoration
mode collapse
diversity recovery
post-training
Constrained Random Character
Bowen Zhang
Department of Computer Science, National University of Singapore, Singapore
Meiyi Wang
Department of Computer Science, National University of Singapore, Singapore
Harold Soh
Associate Professor at National University of Singapore
Human Robot Interaction · Machine Learning · Tactile Perception · Artificial Intelligence · Robotics