GRACE: A Dynamic Coreset Selection Framework for Large Language Model Optimization

📅 2026-04-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Training large language models is computationally expensive, and existing coreset methods are either static or do not scale, so they struggle to keep up with training dynamics that evolve over time. To address this, the work proposes GRACE, a framework that combines gradient-based importance with representation diversity and uses a k-nearest-neighbor graph to guide dynamic coreset selection. To limit the cost of frequent updates, it propagates information over the graph and selectively updates scores and embeddings. Evaluated on three benchmarks, GRACE is reported to significantly improve both training efficiency and downstream task performance across diverse large language models and tasks.

📝 Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language understanding and generation. However, their immense number of parameters and complex transformer-based architectures result in significant resource demands and computational complexity during training, making it challenging to optimize them efficiently on large datasets. To reduce training costs while preserving performance, researchers have investigated coreset selection techniques, which aim to identify small, representative subsets of the entire training dataset to accelerate LLM training. However, existing coreset selection methods fail to adapt to the dynamic nature of LLM training and often struggle with scalability for models of this size. To address these limitations, we propose a graph-guided adaptive and dynamic coreset selection framework for LLMs, namely GRACE. GRACE dynamically constructs and updates coresets by combining representation diversity with gradient-based importance metrics, ensuring both informativeness and efficiency. To mitigate the computational cost of frequent updates, GRACE leverages a $k$-NN graph-based propagation mechanism and selectively updates scores and embeddings, adapting to evolving training dynamics. Extensive experiments on three benchmarks demonstrate that GRACE significantly improves training efficiency and downstream performance across diverse LLMs and tasks.
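The abstract names three ingredients: gradient-based importance, representation diversity, and score propagation over a $k$-NN graph. The sketch below illustrates how such pieces could fit together; it is not the authors' algorithm. Everything in it is an assumption made for illustration: the function names (`knn_graph`, `propagate_scores`, `select_coreset`), the use of cosine similarity, the neighbor-mean propagation rule, the greedy diversity penalty, and the `diversity_weight` parameter. Importance scores are taken as given (for example, per-sample gradient norms computed elsewhere).

```python
import numpy as np

def knn_graph(emb, k):
    """Return each row's k nearest neighbors (cosine similarity) and unit embeddings."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-matches
    return np.argsort(-sim, axis=1)[:, :k], unit

def propagate_scores(scores, fresh_mask, neighbors):
    """Hypothetical rule: a stale score becomes the mean of its freshly scored neighbors."""
    out = scores.copy()
    for i in np.flatnonzero(~fresh_mask):
        fresh_nbrs = [j for j in neighbors[i] if fresh_mask[j]]
        if fresh_nbrs:  # keep the old score when no neighbor was refreshed
            out[i] = scores[fresh_nbrs].mean()
    return out

def select_coreset(unit, scores, budget, diversity_weight=1.0):
    """Greedy pick: favor high importance, penalize similarity to points already chosen."""
    chosen = []
    max_sim = np.zeros(len(scores))
    available = np.ones(len(scores), dtype=bool)
    for _ in range(budget):
        gain = scores - diversity_weight * max_sim
        gain[~available] = -np.inf
        best = int(np.argmax(gain))
        chosen.append(best)
        available[best] = False
        max_sim = np.maximum(max_sim, unit @ unit[best])
    return chosen

# Toy data: two tight clusters; only the first three samples were re-scored.
emb = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]])
scores = np.array([1.0, 0.9, 0.5, 0.0])
fresh = np.array([True, True, True, False])

neighbors, unit = knn_graph(emb, k=1)
scores = propagate_scores(scores, fresh, neighbors)
coreset = select_coreset(unit, scores, budget=2)
```

In this toy run the stale sample inherits its neighbor's score, and the greedy step takes one sample from each cluster instead of the two highest-scoring samples, which sit in the same cluster. A real system would likely replace the brute-force similarity matrix with an approximate nearest-neighbor index, since the dense matrix grows quadratically with dataset size.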

Problem

Research questions and friction points this paper is trying to address.

Large Language Models

coreset selection

training efficiency

dynamic adaptation

scalability

Innovation

Methods, ideas, or system contributions that make the work stand out.

coreset selection

dynamic adaptation

graph-based propagation