🤖 AI Summary
Fine-grained text revision intent prediction poses two key challenges for large language models (LLMs): (i) direct instruction tuning struggles to capture subtle textual distinctions, and (ii) full-parameter fine-tuning relies heavily on scarce, labor-intensive annotated data.
Method: We propose a layer-wise parameter-efficient fine-tuning (PEFT) framework featuring a novel gradient-norm-based dynamic critical layer selection mechanism. This method freezes redundant layers while selectively optimizing only those layers most sensitive to revision identification.
Contribution/Results: The approach significantly enhances discriminative capability under low-data regimes. Experiments demonstrate consistent superiority over existing layer-wise PEFT baselines across multi-class revision classification tasks. It achieves faster convergence, reduced GPU memory footprint, and improved generalization without compromising accuracy, thereby offering an efficient, scalable solution for fine-grained revision intent modeling.
📝 Abstract
Large Language Models (LLMs) have shown extraordinary success across various text generation tasks; however, their potential for simple yet essential text classification remains underexplored, as LLM pre-training tends to emphasize generation over classification. While LLMs with instruction tuning can transform classification into a generation task, they often struggle to categorize nuanced texts. One such example is text revision, which involves nuanced edits between pairs of texts. Although simply fine-tuning LLMs for revision classification seems plausible, it requires a large amount of revision annotations, which are exceptionally expensive and scarce in the community. To address this issue, we introduce a plug-and-play layer-wise parameter-efficient fine-tuning (PEFT) framework, IR-Tuning, which fine-tunes a subset of important LLM layers that are dynamically selected based on their gradient-norm distribution, while freezing the redundant ones. Extensive experiments suggest that IR-Tuning surpasses several layer-wise PEFT baselines over diverse text revisions, while achieving fast convergence, low GPU memory consumption, and effectiveness on small revision corpora.
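The core mechanism, selecting which layers to fine-tune from their gradient-norm distribution, can be sketched in PyTorch. This is a minimal illustration, not the paper's implementation: the probe batch, the top-k selection rule, and the toy stack of `nn.Linear` layers standing in for transformer blocks are all assumptions made here for brevity.

```python
import torch
import torch.nn as nn

def select_important_layers(layers, loss_fn, probe_batch, k):
    """Rank layers by the L2 norm of their gradients on a probe batch
    and return the indices of the top-k layers (a stand-in for the
    paper's gradient-norm-based layer selection)."""
    x, y = probe_batch
    out = x
    for layer in layers:
        out = layer(out)
    loss = loss_fn(out, y)
    layers.zero_grad()
    loss.backward()  # populate .grad on every parameter
    # Aggregate each layer's parameter gradients into a single norm.
    norms = []
    for layer in layers:
        sq = sum(p.grad.norm().item() ** 2
                 for p in layer.parameters() if p.grad is not None)
        norms.append(sq ** 0.5)
    # Keep the k layers with the largest gradient norms.
    return sorted(range(len(norms)), key=lambda i: norms[i], reverse=True)[:k]

def freeze_except(layers, trainable_idx):
    """Freeze all layers except those whose index is in trainable_idx."""
    for i, layer in enumerate(layers):
        for p in layer.parameters():
            p.requires_grad = i in trainable_idx

# Usage on a toy 6-layer model (Linear layers as stand-ins for LLM blocks).
torch.manual_seed(0)
layers = nn.ModuleList([nn.Linear(8, 8) for _ in range(6)])
probe = (torch.randn(4, 8), torch.randn(4, 8))
top = select_important_layers(layers, nn.MSELoss(), probe, k=2)
freeze_except(layers, set(top))
```

After this step, an optimizer would be built only over the parameters with `requires_grad=True`, which is what yields the reduced GPU memory footprint the abstract reports.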