Early-stopped aggregation: Adaptive inference with computational efficiency

📅 2026-04-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the inefficiency of conventional model selection and aggregation methods, which often require fitting complex models even when the true data-generating mechanism is simple. The authors propose the early-stopped aggregation (ESA) framework, which achieves efficient and adaptive inference by computing only a small number of simple models, chosen by an early-stopping criterion, and aggregating only those. ESA unifies early-stopped Bayesian and frequentist penalized aggregation, showing that both are driven by a common “energy” functional composed of a data-fitting term and a complexity-control term. The authors establish adaptive optimal rates for ESA in three settings: variational Bayes, variational empirical Bayes, and frequentist aggregation with penalization-based and sample-splitting implementations. Applications and numerical studies illustrate its computational efficiency and strong performance.

📝 Abstract

When considering model selection or, more generally, aggregation for adaptive statistical inference, it is often necessary, for lack of prior knowledge, to compute estimators over a wide range of model complexities, including unnecessarily large models, even when the true data-generating process is relatively simple. This requirement can lead to substantial computational inefficiency. In this work, we propose a novel framework for efficient model aggregation called early-stopped aggregation (ESA): instead of computing and aggregating estimators for all candidate models, we compute only a small number of simpler ones using an early-stopping criterion and aggregate only these for final inference. Our framework is versatile and applies to both Bayesian model selection, in particular within the variational Bayes framework, and frequentist estimation, including a general penalized estimation setting. We investigate the adaptive optimality of the ESA approach across three learning paradigms. We first show that ESA achieves optimal adaptive contraction rates in the variational Bayes setting under mild conditions. We extend this result to variational empirical Bayes, where prior hyperparameters are chosen in a data-dependent manner. In addition, we apply the ESA approach to frequentist aggregation, including both penalization-based and sample-splitting implementations, and establish the corresponding theory. As we demonstrate, there is a clear unification between early-stopped Bayes and frequentist penalized aggregation, with a common "energy" functional comprising a data-fitting term and a complexity-control term that drives both procedures. We further present several applications and numerical studies that highlight the efficiency and strong performance of the proposed approach.
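The idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not state the stopping rule, the energy functional, or the aggregation weights. The code below assumes nested linear models (the first k columns of a design matrix), a known noise variance, a BIC-style penalty, a patience-based stopping rule, and exponential weights. The names `energy`, `early_stopped_aggregate`, `penalty`, and `patience` are all hypothetical.

```python
import numpy as np


def energy(X, y, k, sigma2, penalty):
    """Hypothetical energy: data-fitting term plus complexity-control term."""
    beta, *_ = np.linalg.lstsq(X[:, :k], y, rcond=None)
    resid = y - X[:, :k] @ beta
    fit = 0.5 * float(resid @ resid) / sigma2  # Gaussian negative log-likelihood (up to a constant)
    return fit + penalty * k, beta


def early_stopped_aggregate(X, y, sigma2=1.0, penalty=None, patience=2):
    """Fit nested models of growing size, stop early, aggregate what was fitted."""
    n, k_max = X.shape
    if penalty is None:
        penalty = 0.5 * np.log(n)  # assumption: BIC-style cost per parameter
    energies, betas = [], []
    best, since_best = np.inf, 0
    for k in range(1, k_max + 1):
        e, beta = energy(X, y, k, sigma2, penalty)
        energies.append(e)
        betas.append(np.pad(beta, (0, k_max - k)))  # zero-pad to a common length
        if e < best:
            best, since_best = e, 0
        else:
            since_best += 1
            if since_best >= patience:  # assumption: stop after `patience` non-improving models
                break
    energies = np.array(energies)
    w = np.exp(-(energies - energies.min()))  # exponential weights, shifted for stability
    w /= w.sum()
    return w @ np.array(betas), w, len(energies)


# Simulated example: 30 candidate models, but only 3 active coefficients.
rng = np.random.default_rng(0)
n, k_max = 200, 30
X = rng.standard_normal((n, k_max))
beta_true = np.zeros(k_max)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + rng.standard_normal(n)

beta_agg, w, n_fitted = early_stopped_aggregate(X, y)
print(f"fitted {n_fitted} of {k_max} candidate models")
```

In this toy setting the loop stops soon after the energy stops improving, so the larger models are never fitted. The aggregate is a weighted average over the fitted models only.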

Problem

Research questions and friction points this paper is trying to address.

adaptive inference

computational efficiency

model aggregation

early stopping

statistical estimation

Innovation

Methods, ideas, or system contributions that make the work stand out.

early-stopped aggregation

adaptive inference

computational efficiency