🤖 AI Summary
To address the prohibitively high computational cost of full-parameter fine-tuning (FPT) for large language models (LLMs), this paper proposes a semantics-aware layer freezing strategy. It is the first to model each layer’s contribution to loss reduction from the perspective of semantic evolution in hidden representations. By analyzing hidden-space transition trajectories and layer-wise bias sensitivity, and integrating scaling laws, the method derives a layer-wise gain estimation that enables interpretable, dynamic selection of trainable layers (“where to fine-tune”). Crucially, backward propagation is omitted for frozen layers, substantially reducing training memory and FLOPs—up to 62% savings. Extensive experiments across multiple LLMs and downstream datasets demonstrate that the approach maintains or even surpasses the performance of both FPT and leading parameter-efficient fine-tuning (PEFT) methods. The core innovation lies in unifying semantic evolution modeling with scaling-law-driven gain quantification, providing both theoretical grounding and a practical framework for efficient LLM adaptation.
📝 Abstract
Finetuning language models (LMs) is crucial for adapting them to downstream data and tasks. However, full finetuning is usually costly. Existing work, such as parameter-efficient finetuning (PEFT), often focuses on *how to finetune* but neglects the question of *where to finetune*. As a pioneering effort to reduce the cost of backpropagation (at the layer level) by answering where to finetune, we conduct a semantic analysis of the LM inference process. We first propose using transition traces of the latent representation to compute deviations (or loss). Then, using a formula derived from scaling laws, we estimate each layer's gain in reducing deviation (or loss). Further, we narrow down the scope for finetuning and study the cost-benefit balance of LM finetuning. We perform extensive experiments across well-known LMs and datasets. The results show that our approach is effective and efficient, outperforming existing baselines. Our approach is orthogonal to other techniques for improving finetuning efficiency, such as PEFT methods, offering practical value for LM finetuning.
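To make the "where to finetune" idea concrete, the selection step can be sketched as ranking layers by an estimated gain and marking only the top-ranked ones as trainable. The deviation values and the simple top-k rule below are illustrative assumptions, not the paper's actual scaling-law formula:

```python
# Hypothetical sketch of layer selection for semantics-aware freezing.
# Assumption: each layer's gain is proportional to the deviation (loss
# contribution) measured along its hidden-state transition trace; the
# paper's derived scaling-law estimate would replace this proxy.

def select_trainable_layers(layer_deviations, k):
    """Return the indices of the k layers with the largest estimated gain;
    all other layers would be frozen (no backward pass through them)."""
    ranked = sorted(range(len(layer_deviations)),
                    key=lambda i: layer_deviations[i],
                    reverse=True)
    return sorted(ranked[:k])

# Toy per-layer deviations for an 8-layer model (assumed values).
deviations = [0.02, 0.15, 0.40, 0.05, 0.31, 0.08, 0.22, 0.01]
print(select_trainable_layers(deviations, 3))  # -> [2, 4, 6]
```

In a real training loop, the returned indices would determine which transformer blocks keep gradients enabled, while the remaining layers skip backpropagation entirely, which is where the reported memory and FLOP savings come from.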