Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Boostrapping

📅 2025-01-31

📈 Citations: 0

✨ Influential: 0

career value

198K/year

🤖 AI Summary

This work addresses the cross-iteration optimal allocation of generation and training budgets in iterative synthetic data bootstrapping to maximize final model performance. We propose the first theoretical framework for budget allocation, rigorously proving that exponential growth—allocating increasingly larger budgets per iteration—achieves superior convergence rates and higher asymptotic performance bounds compared to constant-budget allocation. Our method integrates diffusion models (for image denoising), large language models (for mathematical reasoning), synthetic data filtering, and iterative fine-tuning. Empirical evaluation across two distinct task domains demonstrates that the exponential strategy consistently improves model accuracy by +2.1–4.7%, yields more stable convergence, and effectively overcomes performance plateaus inherent to constant allocation. The core contributions are: (i) a formal convergence analysis establishing budget allocation as a theoretically grounded optimization problem, and (ii) empirical validation identifying exponential budget growth as a superior practical paradigm for iterative synthetic data bootstrapping.

Technology Category

Application Category

📝 Abstract

Modern foundation models often undergo iterative ``bootstrapping'' in their post-training phase: a model generates synthetic data, an external verifier filters out low-quality samples, and the high-quality subset is used for further fine-tuning. Over multiple iterations, the model's performance improves--raising a crucial question: how should the total budget on generation and training be allocated across iterations to maximize final performance? In this work, we develop a theoretical framework to analyze budget allocation strategies. Specifically, we show that constant policies fail to converge with high probability, while increasing policies--particularly exponential growth policies--exhibit significant theoretical advantages. Experiments on image denoising with diffusion probabilistic models and math reasoning with large language models show that both exponential and polynomial growth policies consistently outperform constant policies, with exponential policies often providing more stable performance.

Problem

Research questions and friction points this paper is trying to address.

Budget Allocation

Model Training

Performance Optimization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Budget Allocation Strategies

Exponential Growth Policies

Foundation Models Bootstrapping

🔎 Similar Papers

Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective