GIST: Targeted Data Selection for Instruction Tuning via Coupled Optimization Geometry

📅 2026-02-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a key limitation of existing data selection methods for parameter-efficient fine-tuning (PEFT) approaches such as LoRA: they often neglect the off-diagonal coupling among parameters. To overcome this, the authors propose GIST, a method that abandons the conventional axis-aligned assumption and instead constructs a task-specific low-dimensional gradient subspace via singular value decomposition. Training gradients are projected onto this coupled subspace, and samples are scored by their directional alignment with it. By explicitly capturing the intrinsic parameter-coupling structure of PEFT, GIST matches or outperforms state-of-the-art baselines under the same selection budget while requiring only 0.29% of the storage and 25% of the computational time.

📝 Abstract
Targeted data selection has emerged as a crucial paradigm for efficient instruction tuning, aiming to identify a small yet influential subset of training examples for a specific target task. In practice, influence is often measured through the effect of an example on parameter updates. To make selection scalable, many approaches leverage optimizer statistics (e.g., Adam states) as an axis-aligned surrogate for update geometry (i.e., diagonal preconditioning), implicitly treating parameters as coordinate-wise independent. We show that this assumption breaks down in parameter-efficient fine-tuning (PEFT) methods such as LoRA. In this setting, the induced optimization geometry exhibits strong cross-parameter coupling with non-trivial off-diagonal interactions, while the task-relevant update directions are confined to a low-dimensional subspace. Motivated by this mismatch, we propose GIST (Gradient Isometric Subspace Transformation), a simple yet principled alternative that replaces axis-aligned scaling with robust subspace alignment. GIST recovers a task-specific subspace from validation gradients via spectral filtering (SVD), projects training gradients into this coupled subspace, and scores examples by their alignment with target directions. Extensive experiments demonstrate that GIST matches or outperforms the state-of-the-art baseline with only 0.29% of the storage and 25% of the computational time under the same selection budget.
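The abstract's selection pipeline (SVD of validation gradients, projection of training gradients, alignment-based scoring) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function name `gist_scores`, the rank parameter `k`, and the norm-ratio scoring rule are assumptions made for illustration.

```python
import numpy as np

def gist_scores(val_grads, train_grads, k=8):
    """Hypothetical sketch of GIST-style scoring (not the paper's code).

    val_grads:   (n_val, d)   per-example validation gradients
    train_grads: (n_train, d) per-example training gradients
    k:           subspace rank kept after spectral filtering
    """
    # Recover a task-specific subspace: top-k right singular vectors
    # of the validation-gradient matrix.
    _, _, vt = np.linalg.svd(val_grads, full_matrices=False)
    basis = vt[:k]                     # (k, d)

    # Project training gradients into the coupled subspace.
    proj = train_grads @ basis.T       # (n_train, k)

    # Score each example by directional alignment: the fraction of its
    # gradient norm captured by the subspace (in [0, 1]).
    norms = np.linalg.norm(train_grads, axis=1) + 1e-12
    return np.linalg.norm(proj, axis=1) / norms

# Usage: select the highest-scoring examples under a fixed budget.
rng = np.random.default_rng(0)
scores = gist_scores(rng.normal(size=(16, 64)), rng.normal(size=(100, 64)))
top = np.argsort(scores)[::-1][:10]
```

Under this sketch, an example whose gradient lies mostly inside the validation-derived subspace scores near 1 and is preferred; one whose gradient is orthogonal to it scores near 0.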
Problem

Research questions and friction points this paper is trying to address.

instruction tuning
data selection
parameter-efficient fine-tuning
optimization geometry
LoRA
Innovation

Methods, ideas, or system contributions that make the work stand out.

GIST
instruction tuning
data selection
optimization geometry
parameter-efficient fine-tuning
Guanghui Min
Department of Computer Science, University of Virginia, Charlottesville, USA
Tianhao Huang
University of Virginia
LLM · Graph Neural Network · Recommender System · AI4Science
Ke Wan
Department of Computer Science, University of Virginia, Charlottesville, USA
Chen Chen
University of Virginia
Data Mining · Machine Learning · Computational Epidemiology