🤖 AI Summary
This paper addresses the challenge of balancing modeling flexibility and parameter interpretability in conditional distribution regression. It proposes penalized transformation models (PTMs), a form of location-scale regression in which a monotonically increasing transformation function relates the response distribution to a reference distribution. Structured additive predictors are placed on the location and scale, while the shape of the conditional distribution is estimated directly from the data. The transformation function carries a smoothness prior that regularizes how far the estimated distribution diverges from the reference distribution, which positions PTMs as a bridge between conditional transformation models and generalized additive models for location, scale and shape. Markov chain Monte Carlo inference with the No-U-Turn Sampler (NUTS) gives straightforward uncertainty quantification for the conditional distribution and for the covariate effects. A simulation study demonstrates the effectiveness of the approach, and the model is applied to data from the Fourth Dutch Growth Study and the Framingham Heart Study. A full-featured implementation is available as a Python library.
📝 Abstract
Penalized transformation models (PTMs) are a novel form of location-scale regression. In PTMs, the shape of the response's conditional distribution is estimated directly from the data, and structured additive predictors are placed on its location and scale. The core of the model is a monotonically increasing transformation function that relates the response distribution to a reference distribution. The transformation function is equipped with a smoothness prior that regularizes how much the estimated distribution diverges from the reference distribution. These models can be seen as a bridge between conditional transformation models and generalized additive models for location, scale and shape. Markov chain Monte Carlo inference for PTMs can be conducted with the No-U-Turn sampler and offers straightforward uncertainty quantification for the conditional distribution as well as for the covariate effects. A simulation study demonstrates the effectiveness of the approach. We apply the model to data from the Fourth Dutch Growth Study and the Framingham Heart Study. A full-featured implementation is available as a Python library.
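To make the mechanism in the abstract concrete, the following is a minimal sketch of a location-scale transformation density, not the API of the paper's Python library. It assumes a standard normal reference distribution, and a piecewise-linear increasing function stands in for whatever basis the actual model uses. The function names (`monotone_transform`, `log_density`, `log_smoothness_prior`), the knot range, and the first-order random-walk penalty are all illustrative assumptions. The density follows from the change-of-variables formula: the response is standardized with the location and scale, passed through the increasing transformation, and evaluated under the reference density times the derivative of the transformation.

```python
import numpy as np


def monotone_transform(eps, delta, lo=-4.0, hi=4.0):
    """Piecewise-linear increasing h(eps) and its derivative (illustrative only).

    Segment slopes are exp(delta) > 0, so h is strictly increasing.
    delta = 0 gives h(eps) = eps, i.e. the reference distribution itself.
    """
    eps = np.asarray(eps, dtype=float)
    delta = np.asarray(delta, dtype=float)
    knots = np.linspace(lo, hi, delta.size + 1)
    step = knots[1] - knots[0]
    slopes = np.exp(delta)
    # Function values at the knots: cumulative sum of positive increments.
    values = lo + np.concatenate(([0.0], np.cumsum(slopes * step)))
    idx = np.clip(np.searchsorted(knots, eps, side="right") - 1, 0, delta.size - 1)
    inside = (eps >= lo) & (eps <= hi)
    h_in = values[idx] + slopes[idx] * (eps - knots[idx])
    # Assumption: outside the knot range, continue linearly with slope 1.
    h_out = np.where(eps < lo, values[0] + (eps - lo), values[-1] + (eps - hi))
    return np.where(inside, h_in, h_out), np.where(inside, slopes[idx], 1.0)


def log_density(y, mu, sigma, delta):
    """log f(y | x) = log phi(h(eps)) + log h'(eps) - log sigma, eps = (y - mu) / sigma."""
    eps = (np.asarray(y, dtype=float) - mu) / sigma
    z, dz = monotone_transform(eps, delta)
    log_phi = -0.5 * z**2 - 0.5 * np.log(2.0 * np.pi)
    return log_phi + np.log(dz) - np.log(sigma)


def log_smoothness_prior(delta, tau):
    """Unnormalized first-order random-walk prior on delta (one possible choice).

    Small tau pulls neighbouring slopes together, which keeps h close to linear
    and the fitted distribution close to the reference family.
    """
    return -0.5 * np.sum(np.diff(delta) ** 2) / tau**2
```

In a full model, `mu` and `sigma` would come from structured additive predictors evaluated at the covariates, and the sum of `log_density` over observations plus `log_smoothness_prior` would form the unnormalized log posterior that a sampler such as NUTS explores.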