🤖 AI Summary
To address the inefficiency and high computational cost of static, training-agnostic data selection in large language model (LLM) training, this paper proposes the Data Weighting Model (DWM), which dynamically adjusts the weights of selected data within each batch during training. DWM employs a bi-level optimization framework to update the weighting model so that it tracks the trained model's evolving data preferences. Experiments show that DWM improves the performance of models trained on randomly selected data, and that the learned weighting model transfers to other data selection methods and to models of different sizes. The paper further analyzes how a model's data preferences evolve across training stages, offering new insight into dynamic data utilization in LLM training.
📝 Abstract
While large-scale training data is fundamental for developing capable large language models (LLMs), strategically selecting high-quality data has emerged as a critical approach to enhance training efficiency and reduce computational costs. Current data selection methodologies predominantly rely on static, training-agnostic criteria, failing to account for the dynamic interactions between model training and data. In this paper, we propose a new Data Weighting Model (DWM) that adjusts the weight of selected data within each batch to achieve dynamic data utilization during LLM training. Specifically, to better capture the dynamic data preferences of the trained model, a bi-level optimization framework is implemented to update the weighting model. Our experiments demonstrate that DWM enhances the performance of models trained with randomly selected data, and that the learned weighting model can be transferred to enhance other data selection methods and models of different sizes. Moreover, we analyze how a model's data preferences evolve throughout training, providing new insights into its changing data preferences.
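To make the bi-level idea concrete, here is a minimal toy sketch (not the paper's implementation; all names, model sizes, and the finite-difference hypergradient are illustrative assumptions). An inner step updates the model with per-batch weights produced by a small weighting model; an outer step updates the weighting parameter to reduce loss on a held-out set, so the weighter learns to prefer clean examples:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: half clean, half label-noised examples of y = 2x.
# `quality` is an illustrative per-example feature seen by the weighter.
n_clean, n_noisy = 40, 40
x = rng.normal(size=n_clean + n_noisy)
y = 2.0 * x
y[n_clean:] += rng.normal(scale=3.0, size=n_noisy)
quality = np.r_[np.ones(n_clean), np.zeros(n_noisy)]

x_val = rng.normal(size=20)           # held-out set for the outer objective
y_val = 2.0 * x_val

def batch_weights(phi, q):
    """Weighting model: softmax over a scalar quality feature."""
    logits = phi * q
    e = np.exp(logits - logits.max())
    return e / e.sum()

def inner_step(w, phi, xb, yb, qb, lr=0.1):
    """Inner problem: one weighted-gradient step on the training loss."""
    wts = batch_weights(phi, qb)
    grad = np.sum(wts * 2.0 * (w * xb - yb) * xb)
    return w - lr * grad

def val_loss(w):
    return np.mean((w * x_val - y_val) ** 2)

w, phi = 0.0, 0.0
for step in range(200):
    idx = rng.choice(len(x), size=16, replace=False)
    xb, yb, qb = x[idx], y[idx], quality[idx]
    # Outer problem: move phi to reduce validation loss *after* the inner
    # update (a finite-difference hypergradient, for simplicity).
    eps = 1e-3
    g = (val_loss(inner_step(w, phi + eps, xb, yb, qb))
         - val_loss(inner_step(w, phi - eps, xb, yb, qb))) / (2 * eps)
    phi -= 1.0 * g
    w = inner_step(w, phi, xb, yb, qb)

print(round(w, 2), phi > 0)  # phi > 0 means the weighter upweights clean data
```

Alternating the two updates per batch is what makes the weighting dynamic: the weighter's parameters adapt as the trained model's preferences shift, rather than being fixed before training as in static selection.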