🤖 AI Summary
Traditional k-fold cross-validation often yields strong validation performance but poor generalization for inherently unstable yet interpretable models, such as sparse regression and CART, because these models are highly sensitive to perturbations of the training data.
Method: We propose a nested k-fold cross-validation framework that incorporates an empirical stability regularizer, based on prediction perturbation, into hyperparameter selection. The weight on the stability term is itself learned in an inner cross-validation loop, integrating stability constraints directly into the hyperparameter optimization pipeline.
Contribution/Results: This is the first work to systematically embed stability regularization into hyperparameter tuning while preserving predictive accuracy. Experiments across 13 UCI datasets show that the method reduces out-of-sample mean squared error by 4% on average for sparse ridge regression and CART, with no significant performance change for stable models such as XGBoost, demonstrating both efficacy and specificity in improving generalization for unstable, interpretable models.
📝 Abstract
We revisit the problem of ensuring strong test-set performance via cross-validation. Motivated by the generalization theory literature, we propose a nested k-fold cross-validation scheme that selects hyperparameters by minimizing a weighted sum of the usual cross-validation metric and an empirical model-stability measure. The weight on the stability term is itself chosen via a nested cross-validation procedure. This reduces the risk of strong validation-set performance masking poor test-set performance caused by instability. We benchmark our procedure on a suite of 13 real-world UCI datasets and find that, compared to k-fold cross-validation over the same hyperparameters, it reduces the out-of-sample MSE for sparse ridge regression and CART by 4% on average, but has no impact on XGBoost. This suggests that for interpretable but unstable models, such as sparse regression and CART, our approach is a viable and computationally affordable method for improving test-set performance.
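To make the procedure concrete, here is a minimal sketch of stability-regularized hyperparameter selection. This is not the authors' code: the instability measure used here (variance of the k fold-specific models' predictions, one plausible reading of "prediction perturbation"), the hyperparameter grid, and the outer hold-out split standing in for the full nested k-fold loop are all illustrative assumptions.

```python
# Illustrative sketch (not the paper's implementation) of hyperparameter
# selection that minimizes CV-MSE + lam * instability, with the stability
# weight lam chosen on a held-out split as a simplified stand-in for the
# paper's nested k-fold procedure.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import KFold, train_test_split
from sklearn.tree import DecisionTreeRegressor

def cv_score_and_instability(make_model, X, y, k=5, seed=0):
    """Return (mean CV MSE, instability) for one hyperparameter setting.

    Instability (an assumed measure) is the average variance, across the
    k fold-specific models, of their predictions on the full dataset.
    """
    kf = KFold(n_splits=k, shuffle=True, random_state=seed)
    mses, preds = [], []
    for train_idx, val_idx in kf.split(X):
        model = make_model().fit(X[train_idx], y[train_idx])
        mses.append(np.mean((model.predict(X[val_idx]) - y[val_idx]) ** 2))
        preds.append(model.predict(X))  # this fold's model, all points
    instability = np.mean(np.var(np.stack(preds), axis=0))
    return np.mean(mses), instability

def select_hyperparameter(grid, X, y, lam):
    """Pick the grid value minimizing CV-MSE + lam * instability."""
    scores = {}
    for depth in grid:
        mse, inst = cv_score_and_instability(
            lambda d=depth: DecisionTreeRegressor(max_depth=d, random_state=0),
            X, y)
        scores[depth] = mse + lam * inst
    return min(scores, key=scores.get)

X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Outer loop: choose the stability weight lam by the validation error of the
# model it leads to (the paper nests full k-fold CV here instead).
best_lam, best_err = None, np.inf
for lam in [0.0, 0.1, 1.0]:
    depth = select_hyperparameter([2, 4, 8, None], X_tr, y_tr, lam)
    model = DecisionTreeRegressor(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    err = np.mean((model.predict(X_val) - y_val) ** 2)
    if err < best_err:
        best_lam, best_err = lam, err
print("selected stability weight:", best_lam)
```

Setting lam = 0 recovers plain k-fold cross-validation, so the nested search can only match or improve the validation criterion; the paper's finding is that for unstable models the selected lam is typically nonzero.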