InsBank: Evolving Instruction Subset for Ongoing Alignment

📅 2025-02-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Continual alignment of large language models (LLMs) requires dynamically updating instruction-data subsets, which raises three challenges: high update costs, difficulty balancing quality and diversity, and limited adaptability. Method: This paper proposes the Progressive Instruction Bank Evolution (PIBE) framework and introduces InsBank, a continuously updated instruction-data repository. PIBE pioneers the “instruction bank” paradigm for continual evolution, integrating representation-based diversity measurement, historical information caching, and multi-objective weighted ranking to jointly optimize instruction quality and diversity under budget constraints. Results: Experiments demonstrate that PIBE consistently identifies high-performing instruction subsets across multiple iterative rounds, improving alignment efficiency and cross-task generalization. Compared to baselines, it reduces training cost by 23%–37% while maintaining superior instruction coverage and semantic diversity.

📝 Abstract
Large language models (LLMs) typically undergo instruction tuning to enhance alignment. Recent studies emphasize that quality and diversity of instruction data are more crucial than quantity, highlighting the need to select diverse, high-quality subsets to reduce training costs. However, how to evolve these selected subsets alongside the development of new instruction data remains insufficiently explored. To achieve LLMs' ongoing alignment, we introduce Instruction Bank (InsBank), a continuously updated repository that integrates the latest valuable instruction data. We further propose Progressive Instruction Bank Evolution (PIBE), a novel framework designed to evolve InsBank effectively and efficiently over time. PIBE employs a gradual data selection strategy to maintain long-term efficiency, leveraging a representation-based diversity score to capture relationships between data points and retain historical information for comprehensive diversity evaluation. This also allows for flexible combination of diversity and quality scores during data selection and ranking. Extensive experiments demonstrate that PIBE significantly outperforms baselines in InsBank evolution and is able to extract budget-specific subsets, demonstrating its effectiveness and adaptability.
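
The combination of diversity and quality scores described in the abstract can be illustrated with a minimal sketch. This is not the paper's formulation: the diversity measure used here (one minus the mean cosine similarity to the other data points), the weight `alpha`, and all function names are assumptions chosen for illustration, and quality scores are assumed to be precomputed and scaled to [0, 1].

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def diversity_scores(embeddings):
    """Assumed diversity proxy: 1 - mean cosine similarity to all other points."""
    n = len(embeddings)
    if n == 1:
        return [1.0]
    scores = []
    for i, u in enumerate(embeddings):
        sims = [cosine(u, v) for j, v in enumerate(embeddings) if j != i]
        scores.append(1.0 - sum(sims) / (n - 1))
    return scores


def rank_by_combined_score(embeddings, quality, alpha=0.5):
    """Return indices sorted by alpha * quality + (1 - alpha) * diversity, best first."""
    diversity = diversity_scores(embeddings)
    combined = [alpha * q + (1 - alpha) * d for q, d in zip(quality, diversity)]
    return sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)


def budget_subset(ranking, budget):
    """A budget-specific subset is the top `budget` entries of the ranking."""
    return ranking[:budget]


# Toy data: items 0 and 1 are near-duplicates, item 2 is distinct but lower quality.
embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
quality = [0.9, 0.8, 0.5]
ranking = rank_by_combined_score(embeddings, quality, alpha=0.5)
print(ranking)                    # [2, 0, 1]
print(budget_subset(ranking, 2))  # [2, 0]
```

With `alpha=1.0` the ranking reduces to pure quality ordering; lowering `alpha` moves the distinct item ahead of the near-duplicates, which is the trade-off the flexible combination is meant to expose.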
Problem

Research questions and friction points this paper is trying to address.

Evolving instruction subsets for LLM alignment
Continuous integration of new instruction data
Efficient and diverse instruction data selection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Continuous instruction data repository
Progressive evolution framework
Diversity and quality scoring
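
To make the idea of a continuously updated repository concrete, the following is a hypothetical sketch of one evolution round: the current bank is merged with newly arrived candidates and a fixed-capacity subset is reselected greedily. The class `InsBankSketch`, the `Instruction` record, the squared-distance novelty term, and the weight `alpha` are all illustrative assumptions, not the PIBE algorithm itself.

```python
from dataclasses import dataclass, field


@dataclass
class Instruction:
    text: str
    quality: float    # assumed precomputed quality score in [0, 1]
    embedding: tuple  # assumed fixed-size representation of the instruction


def _sq_dist(u, v):
    """Squared Euclidean distance between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


@dataclass
class InsBankSketch:
    capacity: int
    alpha: float = 0.5
    items: list = field(default_factory=list)

    def evolve(self, candidates):
        """Merge the bank with new candidates and greedily reselect `capacity` items."""
        pool = self.items + list(candidates)
        selected = []
        while pool and len(selected) < self.capacity:

            def gain(x):
                # Novelty: distance to the nearest already-selected item
                # (1.0 when nothing has been selected yet).
                novelty = min(
                    (_sq_dist(x.embedding, s.embedding) for s in selected),
                    default=1.0,
                )
                return self.alpha * x.quality + (1 - self.alpha) * novelty

            best = max(pool, key=gain)
            pool.remove(best)
            selected.append(best)
        # Items are stored in selection order, so any prefix is a smaller-budget subset.
        self.items = selected
        return self.items


bank = InsBankSketch(capacity=2)
a = Instruction("a", 0.9, (0.0, 0.0))
b = Instruction("b", 0.8, (0.0, 0.0))  # near-duplicate of "a"
c = Instruction("c", 0.6, (1.0, 0.0))  # newer, lower quality, but distinct

bank.evolve([a, b])
print([x.text for x in bank.items])  # ['a', 'b']
bank.evolve([c])
print([x.text for x in bank.items])  # ['a', 'c']
```

In the second round the distinct newcomer displaces the near-duplicate even though its quality score is lower, which is the behavior a quality-plus-diversity bank update is intended to produce.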