🤖 AI Summary
To quantify uncertainty in individual driver claim costs for motor insurance pricing, this paper proposes a distribution-free method for constructing prediction intervals. Methodologically, it applies a regularized Tweedie generalized linear model and LightGBM with Tweedie loss, and introduces a novel nonconformity measure based on locally weighted Pearson residuals, combined with isotonic regression-based conformal prediction (COP) for robust interval estimation. The key contributions are: (i) the first adaptation of COP to the Tweedie loss framework, which avoids strong distributional assumptions on residuals; and (ii) a nonconformity measure that substantially improves both coverage probability and interval sharpness. Evaluation on insurance claims data indicates that locally weighted Pearson residuals with LightGBM give the narrowest average prediction interval width among the methods considered while maintaining the nominal coverage level.
📝 Abstract
Quantifying prediction uncertainty has become a key research topic in scientific and business problems in recent years. In the insurance industry \cite{parodi2023pricing}, assessing the range of possible claim costs for individual drivers improves premium pricing accuracy. It also enables insurers to manage risk more effectively by accounting for uncertainty in accident likelihood and severity. In the presence of covariates, a variety of regression-type models are used for modeling insurance claims, ranging from relatively simple generalized linear models (GLMs) to regularized GLMs to gradient boosting models (GBMs). Conformal predictive inference has emerged as a popular distribution-free approach for quantifying predictive uncertainty under the relatively weak assumption of exchangeability, and has been well studied in the classic linear regression setting. In this work, we propose new non-conformity measures for GLMs and GBMs with GLM-type loss. Using regularized Tweedie GLM regression and LightGBM with Tweedie loss, we demonstrate conformal prediction performance with these non-conformity measures on insurance claims data. Our simulation results favor the use of locally weighted Pearson residuals for LightGBM over the other methods considered, as the resulting intervals maintained the nominal coverage with the smallest average width.
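As a rough illustration of the idea, the sketch below shows split conformal prediction with a Pearson-residual non-conformity score under a Tweedie variance function V(mu) = mu^p. It is not the authors' implementation: the function names, the default variance power of 1.5, the optional `scale` argument (a stand-in for a local weighting term, whose exact form the abstract does not specify), and the toy gamma data are all assumptions made for the example. The predicted means would normally come from a fitted Tweedie GLM or LightGBM model; here they are simulated.

```python
import numpy as np

def pearson_score(y, mu, power=1.5, scale=1.0):
    """Absolute Pearson residual |y - mu| / sqrt(V(mu)) with V(mu) = mu**power.

    `scale` is a hypothetical extra local weight (1.0 means no local weighting).
    Assumes mu > 0.
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return np.abs(y - mu) / (np.power(mu, power / 2.0) * scale)

def conformal_quantile(scores, alpha=0.1):
    """Finite-sample split-conformal quantile: the ceil((n+1)(1-alpha))-th smallest score."""
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    # Too few calibration points for the requested level: the interval is unbounded.
    return np.inf if k > n else scores[k - 1]

def pearson_interval(mu_new, q, power=1.5, scale=1.0):
    """Invert the score: mu +/- q * sqrt(V(mu)) * scale, truncated at zero (claims are non-negative)."""
    mu_new = np.asarray(mu_new, dtype=float)
    half = q * np.power(mu_new, power / 2.0) * scale
    return np.maximum(mu_new - half, 0.0), mu_new + half

# Toy calibration set (assumption): simulated means and gamma-distributed claims.
rng = np.random.default_rng(0)
mu_cal = rng.uniform(50.0, 500.0, size=1000)      # stand-in for model predictions
y_cal = rng.gamma(shape=2.0, scale=mu_cal / 2.0)  # claims with mean mu_cal

q = conformal_quantile(pearson_score(y_cal, mu_cal), alpha=0.1)
lower, upper = pearson_interval(np.array([100.0, 400.0]), q)
print(lower, upper)
```

Because the half-width scales with mu^(p/2), drivers with larger predicted claim costs receive wider intervals, which is the motivation for residuals normalized by the variance function rather than raw absolute residuals.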