🤖 AI Summary
This paper addresses nonparametric function estimation from i.i.d. data. It proposes M-HAL-MLE, a super-learner framework built on the highly adaptive lasso (HAL): HAL is embedded as a meta-learning layer that, via $V$-fold cross-validation under a sectional-variation-norm constraint, selects the optimal ensemble mapping within the class of càdlàg functions. Theoretically, the super-learner converges to the oracle ensemble at rate $n^{-2/3}$ (up to a $\log n$ factor) in excess risk, faster than the $n^{-1/2}$ rate typical of conventional ensemble methods. Moreover, target-feature estimators based on the undersmoothed super-learner are asymptotically linear, with influence curve equal to the efficient influence curve or potentially a super-efficient one, and the excess risk of the oracle ensemble relative to the true function is second order. This framework improves statistical efficiency and estimation accuracy, offering a paradigm for high-dimensional nonparametric inference that combines strong theoretical guarantees with computational feasibility.
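In symbols (the notation here is assumed for illustration, not quoted from the paper): write $\hat\psi_{j,v}$ for the $j$-th candidate estimator trained on the $v$-th training sample, $P^{1}_{n,v}$ for the empirical distribution of the $v$-th validation sample, $L$ for the loss, and $\mathcal{F}_C$ for the class of $J$-variate càdlàg functions with sectional variation norm at most $C$. The meta-level fit and the resulting super-learner can then be sketched as

```latex
\hat f_n \;=\; \arg\min_{f \in \mathcal{F}_C}\;
  \frac{1}{V} \sum_{v=1}^{V} P^{1}_{n,v}\,
  L\bigl( f(\hat\psi_{1,v}, \dots, \hat\psi_{J,v}) \bigr),
\qquad
\hat\psi_n \;=\; \frac{1}{V} \sum_{v=1}^{V}
  \hat f_n(\hat\psi_{1,v}, \dots, \hat\psi_{J,v}).
```

The first display is the cross-validated empirical risk minimization defining M-HAL-MLE; the second is the average of the $V$ fold-specific compositions that defines the M-HAL super-learner.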
📝 Abstract
We consider estimation of a functional parameter of a realistically modeled data distribution based on independent and identically distributed observations. Suppose that the true function is defined as the minimizer of the expectation of a specified loss function over its parameter space. We are given $J$ estimators of the true function, which can be viewed as a data-adaptive coordinate transformation for the true function. For any $J$-dimensional real-valued cadlag function with finite sectional variation norm, we define a candidate ensemble estimator as the mapping from the data into the composition of the cadlag function and the $J$ estimated functions. Using $V$-fold cross-validation, we define the cross-validated empirical risk of each cadlag-function-specific ensemble estimator. We then define the Meta Highly Adaptive Lasso Minimum Loss Estimator (M-HAL-MLE) as the cadlag function that minimizes this cross-validated empirical risk over all cadlag functions with a uniform bound on the sectional variation norm. For each of the $V$ training samples, this yields a composition of the M-HAL-MLE ensemble and the $J$ estimated functions trained on that training sample. We estimate the true function with the average of these $V$ estimated functions, which we call the M-HAL super-learner. The M-HAL super-learner converges to the oracle estimator at rate $n^{-2/3}$ (up to a $\log n$ factor) with respect to excess risk, where the oracle estimator minimizes the excess risk among all considered ensembles. The excess risk of the oracle estimator relative to the true function is generally second order. Under weak conditions on the $J$ candidate estimators, target features of the undersmoothed M-HAL super-learner are asymptotically linear estimators of the corresponding target features of the true function, with influence curve either the efficient influence curve or, potentially, a super-efficient influence curve.
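The procedure described above can be sketched in code. The following is a minimal, illustrative Python sketch under squared-error loss, with a lasso over zero-order spline (indicator) basis functions standing in for a full HAL fit at the meta level; the candidate learners, knot choices, and penalty level are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)

# Simulated data: y = sin(3*x0) + x1 + noise (illustrative only).
n, d = 400, 2
X = rng.uniform(-1, 1, size=(n, d))
y = np.sin(3 * X[:, 0]) + X[:, 1] + 0.1 * rng.standard_normal(n)

# J = 2 candidate estimators (deliberately simple stand-ins).
def fit_mean(Xtr, ytr):
    m = ytr.mean()
    return lambda Xnew: np.full(len(Xnew), m)

def fit_linear(Xtr, ytr):
    lm = LinearRegression().fit(Xtr, ytr)
    return lambda Xnew: lm.predict(Xnew)

candidates = [fit_mean, fit_linear]
J = len(candidates)

def make_basis(Z, knots):
    # Zero-order spline basis in the J candidate outputs:
    # one column 1{Z_j >= t} per coordinate j and knot t (main terms only).
    cols = [(Z[:, j:j + 1] >= knots[j][None, :]).astype(float)
            for j in range(Z.shape[1])]
    return np.hstack(cols)

# V-fold cross-validated predictions of the candidates.
V = 5
folds = np.array_split(rng.permutation(n), V)
Z_cv = np.zeros((n, J))   # validation-sample candidate predictions
fold_fits = []            # candidates trained on each training sample
for idx in folds:
    mask = np.ones(n, dtype=bool)
    mask[idx] = False
    fits = [fit(X[mask], y[mask]) for fit in candidates]
    fold_fits.append(fits)
    for j, f in enumerate(fits):
        Z_cv[idx, j] = f(X[idx])

# Meta-level HAL stand-in: lasso on the indicator basis of the
# cross-validated candidate predictions (L1 penalty plays the role
# of the sectional-variation-norm bound).
knots = [np.quantile(Z_cv[:, j], np.linspace(0.05, 0.95, 10))
         for j in range(J)]
meta = Lasso(alpha=0.01).fit(make_basis(Z_cv, knots), y)

# M-HAL super-learner: average the V fold-specific compositions of the
# fitted meta-level map with the fold-trained candidate estimators.
def super_learner(Xnew):
    preds = []
    for fits in fold_fits:
        Z = np.column_stack([f(Xnew) for f in fits])
        preds.append(meta.predict(make_basis(Z, knots)))
    return np.mean(preds, axis=0)

yhat = super_learner(X)
```

Replacing the indicator-basis lasso with a genuine HAL fit (e.g., via an HAL package) and the toy candidates with serious machine-learning algorithms recovers the structure of the M-HAL super-learner; the averaging in `super_learner` mirrors the average over the $V$ fold-specific compositions described in the abstract.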