🤖 AI Summary
This paper addresses a limitation of conventional generalized additive models (GAMs): covariate transformations are usually applied as ad hoc preprocessing, which prevents their parameters from being estimated jointly with the rest of the model and leaves their uncertainty unquantified. We propose an end-to-end modeling framework that embeds differentiable parametric transformations directly in the GAM structure. Transformation parameters and regression coefficients are estimated jointly by maximum a posteriori (MAP) methods, and the framework accommodates any sufficiently differentiable scalar-valued transformation of potentially high-dimensional, complex covariates. Smoothing parameters are selected in an empirical Bayes framework using a Laplace approximation to the marginal likelihood, with implicit differentiation keeping the computation efficient; joint uncertainty is quantified with approximate Bayesian techniques. Applications to forecasting electricity net-demand in Great Britain and to modelling London house prices illustrate the flexibility and practical value of the approach. The methodology is implemented in the open-source R package *gamFactory*.
📝 Abstract
Transformations of covariates are widely used in applied statistics to improve interpretability and to satisfy assumptions required for valid inference. More broadly, feature engineering encompasses a wider set of practices aimed at enhancing predictive performance, and is typically performed as part of a data pre-processing step. In contrast, this paper integrates a substantial component of the feature engineering process directly into the modelling stage. This is achieved by introducing a novel general framework for embedding interpretable covariate transformations within multi-parameter Generalised Additive Models (GAMs). Our framework accommodates any sufficiently differentiable scalar-valued transformation of potentially high-dimensional and complex covariates. These transformations are treated as integral model components, with their parameters estimated jointly with regression coefficients via maximum a posteriori (MAP) methods, and joint uncertainty quantified via approximate Bayesian techniques. Smoothing parameters are selected in an empirical Bayes framework using a Laplace approximation to the marginal likelihood, supported by efficient computation based on implicit differentiation methods. We demonstrate the flexibility and practical value of the proposed methodology through applications to forecasting electricity net-demand in Great Britain and to modelling house prices in London. The proposed methods are implemented by the gamFactory R package, available at https://github.com/mfasiolo/gamFactory.
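To make the idea of estimating a transformation parameter jointly with the regression coefficients concrete, the following is a minimal toy sketch, not the gamFactory implementation. Everything in it is an assumption made for illustration: the exponential-decay transformation `transform`, the function names, the single ridge penalty `lam` standing in for a Gaussian prior, and the simulated data. It also swaps in a coarse grid followed by a golden-section search over the transformation parameter in place of the paper's optimisers, and it leaves out smoothing-parameter selection and the Laplace approximation entirely.

```python
import math


def transform(x, alpha):
    """Hypothetical differentiable transformation of a scalar covariate."""
    return math.exp(-alpha * x)


def fit_beta(xs, ys, alpha, lam):
    """Closed-form penalised least squares for (b0, b1) at a fixed alpha.

    Model: y = b0 + b1 * transform(x, alpha) + noise, with a ridge
    penalty lam on b1 only (a Gaussian prior on b1, in MAP terms).
    """
    n = len(xs)
    ts = [transform(x, alpha) for x in xs]
    s_t = sum(ts)
    s_tt = sum(t * t for t in ts) + lam
    s_y = sum(ys)
    s_ty = sum(t * y for t, y in zip(ts, ys))
    det = n * s_tt - s_t * s_t
    b0 = (s_tt * s_y - s_t * s_ty) / det
    b1 = (n * s_ty - s_t * s_y) / det
    return b0, b1


def profile_objective(xs, ys, alpha, lam):
    """Negative log-posterior (up to constants), minimised over (b0, b1)."""
    b0, b1 = fit_beta(xs, ys, alpha, lam)
    rss = sum((y - b0 - b1 * transform(x, alpha)) ** 2 for x, y in zip(xs, ys))
    return 0.5 * rss + 0.5 * lam * b1 ** 2


def fit_joint(xs, ys, lam=1e-6, lo=0.05, hi=5.0, grid=200, iters=60):
    """Joint MAP estimate of (alpha, b0, b1) by profiling out (b0, b1)."""
    def f(a):
        return profile_objective(xs, ys, a, lam)

    # Coarse grid to bracket the best alpha.
    alphas = [lo + (hi - lo) * k / (grid - 1) for k in range(grid)]
    vals = [f(a) for a in alphas]
    k = vals.index(min(vals))
    a, b = alphas[max(k - 1, 0)], alphas[min(k + 1, grid - 1)]

    # Golden-section refinement inside the bracket.
    invphi = (math.sqrt(5) - 1) / 2
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    fc, fd = f(c), f(d)
    for _ in range(iters):
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)

    alpha_hat = 0.5 * (a + b)
    b0, b1 = fit_beta(xs, ys, alpha_hat, lam)
    return alpha_hat, b0, b1


if __name__ == "__main__":
    # Simulated, noise-free data with a known decay rate (illustrative only).
    xs = [0.1 * k for k in range(1, 41)]
    ys = [2.0 + 3.0 * math.exp(-1.5 * x) for x in xs]
    print(fit_joint(xs, ys))
```

Minimising the profiled objective over `alpha` gives the same point estimate as minimising over all three parameters at once, which is why the sketch can solve for `(b0, b1)` in closed form at each candidate `alpha`. A fixed preprocessing step would instead pick `alpha` in advance and never revisit it.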