🤖 AI Summary
To address memory bottlenecks in large language model (LLM) fine-tuning caused by massive datasets, this paper proposes QLESS: an efficient data valuation and selection framework tailored for memory-constrained settings. Methodologically, it introduces gradient quantization into the LESS framework, combining LoRA-based random projection, low-bitwidth gradient quantization, and low-rank gradient similarity search. This design sharply reduces memory overhead while preserving high fidelity in data value estimation. Experiments on LLaMA, Mistral, and Qwen show that QLESS matches the data selection performance of the original LESS on benchmarks including MMLU, BBH, and TyDiQA, while reducing memory consumption by up to 16×; even 1-bit gradient quantization preserves data valuation quality. The core contribution is a high-fidelity, ultra-low-memory paradigm for subset selection, enabling scalable, resource-efficient LLM fine-tuning without compromising evaluation accuracy.
📝 Abstract
Fine-tuning large language models (LLMs) is often constrained by the computational costs of processing massive datasets. We propose **QLESS** (Quantized Low-rank Gradient Similarity Search), which integrates gradient quantization with the LESS framework to enable memory-efficient data valuation and selection. QLESS employs a two-step compression process: first, it obtains low-dimensional gradient representations through LoRA-based random projection; then, it quantizes these gradients to low-bitwidth representations. Experiments on multiple LLM architectures (LLaMA, Mistral, Qwen) and benchmarks (MMLU, BBH, TyDiQA) show that QLESS achieves comparable data selection performance to LESS while reducing memory usage by up to 16×. Even 1-bit gradient quantization preserves data valuation quality. These findings underscore QLESS as a practical, scalable approach to identifying informative examples within strict memory constraints.
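The two-step compression pipeline described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the dimensions `d` and `k`, the Gaussian projection matrix, the per-vector mean-absolute scale, and the cosine-similarity scoring are all illustrative assumptions standing in for the actual LoRA gradients and the LESS influence formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: d = flattened (LoRA) gradient dimension, k = projected dimension.
d, k = 4096, 256

# Step 1 (sketch): random projection to k dimensions with a fixed Gaussian matrix,
# standing in for the LoRA-based low-rank projection used in the paper.
P = rng.standard_normal((k, d)) / np.sqrt(k)

def project(grad):
    return P @ grad

# Step 2 (sketch): 1-bit quantization - store only the sign of each coordinate
# plus one floating-point scale per vector. Packed as bits, this is roughly a
# 16x reduction versus fp16 storage.
def quantize_1bit(g):
    scale = np.abs(g).mean()
    return np.sign(g).astype(np.int8), scale

def dequantize(sign, scale):
    return sign.astype(np.float32) * scale

# Step 3 (sketch): value a training example by the cosine similarity between its
# compressed gradient and a target-task gradient, as in gradient-similarity search.
def score(train_grad, target_grad):
    a = dequantize(*quantize_1bit(project(train_grad)))
    b = project(target_grad)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

# A training gradient aligned with the target should outscore an unrelated one,
# even after 1-bit quantization.
train_grad = rng.standard_normal(d)
target_grad = train_grad + 0.1 * rng.standard_normal(d)
unrelated = rng.standard_normal(d)
assert score(train_grad, target_grad) > score(unrelated, target_grad)
```

The sign-plus-scale scheme is why aggressive quantization can be nearly lossless for this task: data selection only needs the *ranking* of similarity scores, which is largely preserved by gradient direction alone.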