Instruction Tuning for Large Language Models: A Survey

📅 2023-08-21
🏛️ arXiv.org
📈 Citations: 642
Influential: 22
🤖 AI Summary
This study addresses the fundamental alignment gap between large language models' (LLMs) pretraining objective—next-token prediction—and human-centric instruction-following requirements. It systematically surveys instruction tuning techniques, analyzing methodological evolution, strategies for constructing high-quality instruction-output pairs, multi-stage training paradigms, and cross-modal/domain adaptation pathways. Key determinants of generalization and controllability—such as data diversity, format consistency, and task coverage—are identified. Notably, the work presents a structured, knowledge-graph-style survey integrating theoretical foundations, practical frameworks, and critical reflection. It explicitly delineates current limitations—including instruction bias and the absence of standardized evaluation metrics—and proposes future research directions: scalable alignment, dynamic instruction synthesis, and causally grounded controllable generation. The resulting synthesis has become a benchmark reference in the LLM alignment community.
📝 Abstract
This paper surveys research works in the quickly advancing field of instruction tuning (IT), also referred to as supervised fine-tuning (SFT) (in this paper, unless specified otherwise, the two terms are used interchangeably), a crucial technique to enhance the capabilities and controllability of large language models (LLMs). Instruction tuning refers to the process of further training LLMs on a dataset of (instruction, output) pairs in a supervised fashion, which bridges the gap between the next-word prediction objective of LLMs and the users' objective of having LLMs adhere to human instructions. In this work, we systematically review the literature, including the general methodology of SFT, the construction of SFT datasets, the training of SFT models, and applications to different modalities, domains, and settings, along with an analysis of factors that influence the outcome of SFT (e.g., the generation of instruction outputs, the size of the instruction dataset). We also review the potential pitfalls of SFT and criticism against it, discuss efforts to address deficiencies of existing strategies, and suggest avenues for fruitful research. Project Page: github.com/xiaoya-li/Instruction-Tuning-Survey
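The supervision described in the abstract—further training on (instruction, output) pairs while keeping the next-word prediction objective—can be sketched minimally. The prompt template, the whitespace "tokenizer", and the `build_example` helper below are illustrative assumptions, not the survey's exact setup; real pipelines use a subword tokenizer and a model-specific chat template.

```python
# Minimal sketch of how one (instruction, output) pair becomes a supervised
# training example for SFT. The key idea: the model still predicts the next
# token, but the loss is computed only on the response tokens, so prompt
# positions are masked with a sentinel label.

IGNORE_INDEX = -100  # common convention: labels at this value are excluded from the loss

def build_example(instruction: str, output: str) -> dict:
    """Concatenate prompt and answer; mask prompt tokens out of the loss."""
    prompt = f"Instruction: {instruction}\nResponse:"
    prompt_tokens = prompt.split()   # stand-in for real subword tokenization
    answer_tokens = output.split()
    input_ids = prompt_tokens + answer_tokens
    # Only the response contributes to the training loss:
    labels = [IGNORE_INDEX] * len(prompt_tokens) + answer_tokens
    return {"input_ids": input_ids, "labels": labels}

example = build_example("Name the capital of France.", "Paris")
```

In a real framework the masked positions correspond to a loss function's ignore index (e.g., `ignore_index=-100` in PyTorch's cross-entropy loss), which is what lets the same next-word objective be restricted to instruction-following outputs.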
Problem

Research questions and friction points this paper is trying to address.

Surveying instruction tuning techniques for large language models
Bridging gap between model prediction and user instruction adherence
Reviewing methodologies, datasets, and applications of supervised fine-tuning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Instruction tuning enhances LLM capabilities
Systematic review of SFT methodology and datasets
Analysis of SFT pitfalls and research avenues
Shengyu Zhang
Zhejiang University
Linfeng Dong
Zhejiang University
Xiaoya Li
University of Washington
Sen Zhang
Zhejiang University
Xiaofei Sun
Stony Brook University, Zhejiang University
Social and Information Networks · Natural Language Processing · Machine Learning
Shuhe Wang
Peking University, University of Melbourne
Natural Language Processing · Machine Learning
Jiwei Li
Zhejiang University
Runyi Hu
Nanyang Technological University
Large Language Model · AI Alignment · Watermarking
Tianwei Zhang
Nanyang Technological University
Fei Wu
Zhejiang University
Guoyin Wang
Amazon