Active Budget Allocation for Efficient Scaling Law Estimation via Surrogate-Guided Pruning

📅 2026-05-16

🤖 AI Summary
Estimating scaling laws is computationally expensive, and how to allocate a fixed compute budget across training runs has received little systematic attention. This work combines the Successive Halving (SH) algorithm with parametric and non-parametric surrogate models to decide which runs receive further compute while the loss-compute frontier is being constructed. Compared with naive uniform allocation and SH alone, the surrogate-guided variant reports mean relative improvements of up to 2.84% on real-world learning curve datasets and 5.47% on synthetic ones, and it saves up to 98.7% of the compute required by the traditional exhaustive approach.
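To make the idea of a parametric surrogate concrete, the following minimal sketch fits a power law to the first few points of a learning curve and extrapolates it to a larger budget. The functional form `loss(t) = a * t**(-b)` (with no irreducible-loss term), the function names, and the two example curves are all illustrative assumptions; the summary does not specify which surrogates the paper uses or how they enter the pruning decision.

```python
import math

def fit_power_law(steps, losses):
    """Least-squares fit of log(loss) = log(a) - b * log(t).

    Assumes loss(t) = a * t**(-b); this form is an illustrative choice,
    not necessarily the surrogate used in the paper.
    """
    xs = [math.log(t) for t in steps]
    ys = [math.log(l) for l in losses]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope  # (a, b)

def predict(a, b, t):
    """Extrapolated loss at training step t under the fitted power law."""
    return a * t ** (-b)

# Two hypothetical runs observed only at steps 1 and 2.
observed_steps = [1, 2]
run_a = [1.0 * t ** (-0.1) for t in observed_steps]  # starts low, improves slowly
run_b = [1.5 * t ** (-0.5) for t in observed_steps]  # starts high, improves quickly

fit_a = fit_power_law(observed_steps, run_a)
fit_b = fit_power_law(observed_steps, run_b)

# Ranking by the last observed loss versus by the extrapolated loss at step 64.
better_now = "a" if run_a[-1] < run_b[-1] else "b"
better_later = "a" if predict(*fit_a, 64) < predict(*fit_b, 64) else "b"
```

In this toy case the two rankings disagree: run `a` has the lower loss at the last observed step, while run `b` has the lower extrapolated loss. That is the kind of situation in which a surrogate could change which run survives a pruning step.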
📝 Abstract
Predicting model performance at larger scales enables the design of training strategies and architectures tailored to specific performance targets. Empirical scaling law research identifies functional forms to aid this prediction task. These describe the relationship between loss and compute using a loss-compute frontier defined by learning curves. Due to the empirical nature of this approach, the computational burden is substantial, making strategic resource allocation essential, yet it remains surprisingly underexplored. In this work, we address this shortcoming by exploring the suitability of Successive Halving (SH) and SH combined with parametric and non-parametric surrogate models. In addition to enabling a more systematic allocation of a given compute budget, our findings show that SH paired with surrogate models yields a set of learning curves that includes one with a lower loss-compute value than what naive uniform allocation or an SH-only approach can obtain. Our experiments demonstrate mean relative improvements of up to 2.84% and 5.47% on real-world and synthetic learning curve datasets, respectively. This strategic resource allocation enables us to obtain accurate scaling laws at significantly reduced computational costs, saving up to 98.7% over the traditional exhaustive approach.
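As a rough illustration of the SH baseline mentioned in the abstract, the sketch below runs plain Successive Halving over a set of synthetic learning curves and tallies the training steps spent. The synthetic curve form, the halving factor `eta=2`, and all function names are assumptions made for this example; the paper's actual setup and its surrogate-guided variant are not reproduced here.

```python
def make_curve(a, b, c):
    """Hypothetical synthetic learning curve: loss(t) = a * t**(-b) + c."""
    return lambda t: a * t ** (-b) + c

def successive_halving(curves, min_steps=1, eta=2):
    """Plain Successive Halving over a list of learning-curve functions.

    At each rung every surviving configuration is trained up to `steps`,
    the best 1/eta by observed loss are kept, and the budget is multiplied
    by eta. Returns the winning index and the steps spent per configuration.
    """
    survivors = list(range(len(curves)))
    trained = {i: 0 for i in survivors}
    steps = min_steps
    while True:
        losses = {}
        for i in survivors:
            trained[i] = steps            # cumulative steps for this run
            losses[i] = curves[i](steps)  # observed loss at this rung
        if len(survivors) == 1:
            break
        keep = max(1, len(survivors) // eta)
        survivors = sorted(survivors, key=lambda i: losses[i])[:keep]
        steps *= eta
    return survivors[0], trained

# Eight hypothetical configurations that differ only in their loss floor.
curves = [make_curve(1.0, 0.5, 0.1 * k) for k in range(8)]
winner, trained = successive_halving(curves)

sh_cost = sum(trained.values())                 # steps spent by SH
uniform_cost = len(curves) * max(trained.values())  # all runs trained to the final rung
```

Here SH spends 20 training steps in total, whereas training all eight configurations to the final rung of 8 steps would cost 64. Because these toy curves never cross, ranking by observed loss always keeps the best run; with crossing curves, SH alone can prune a run that would have ended up lower, which is the gap a surrogate model is meant to address.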
Problem

Research questions and friction points this paper is trying to address.

scaling laws
compute budget allocation
learning curves
efficient estimation
resource allocation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Successive Halving
surrogate models
scaling laws
budget allocation
compute-efficient training