Distribution-free inference for LightGBM and GLM with Tweedie loss

📅 2025-07-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To quantify uncertainty in individual drivers' claim costs for motor insurance pricing, this paper proposes a distribution-free method for constructing prediction intervals. It fits a regularized Tweedie generalized linear model and LightGBM with Tweedie loss, and introduces nonconformity measures for these models, including one based on locally weighted Pearson residuals, combined with isotonic regression-based conformal prediction (COP) for interval estimation. The stated contributions are: (i) an adaptation of COP to the Tweedie loss framework, which avoids strong distributional assumptions on residuals; and (ii) a nonconformity measure that improves interval sharpness while preserving coverage. In the reported experiments on insurance claims data, locally weighted Pearson residuals with LightGBM maintained the nominal coverage level with the smallest average interval width among the methods considered.

📝 Abstract

Quantifying prediction uncertainty has become a key research topic in scientific and business problems in recent years. In the insurance industry (Parodi, 2023), assessing the range of possible claim costs for individual drivers improves premium pricing accuracy. It also enables insurers to manage risk more effectively by accounting for uncertainty in accident likelihood and severity. In the presence of covariates, a variety of regression-type models are used to model insurance claims, ranging from relatively simple generalized linear models (GLMs) to regularized GLMs to gradient boosting models (GBMs). Conformal predictive inference has emerged as a popular distribution-free approach for quantifying predictive uncertainty under the relatively weak assumption of exchangeability, and has been well studied in the classic linear regression setting. In this work, we propose new non-conformity measures for GLMs and for GBMs with GLM-type loss. Using regularized Tweedie GLM regression and LightGBM with Tweedie loss, we demonstrate conformal prediction performance with these non-conformity measures on insurance claims data. Our simulation results favor the use of locally weighted Pearson residuals for LightGBM over the other methods considered, as the resulting intervals maintained the nominal coverage with the smallest average width.
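
As a rough illustration of the idea in the abstract, the following is a minimal split-conformal sketch that uses absolute Pearson residuals under the Tweedie variance function V(mu) = mu^p as the non-conformity score. It is not the paper's implementation: the function names, the power p = 1.5, the toy gamma data, and the stand-in `fitted_mean` (which plays the role of a fitted Tweedie GLM or LightGBM model) are all assumptions made for the example, and the local weighting step is omitted.

```python
import math
import random

def pearson_score(y, mu, p=1.5):
    """Absolute Pearson residual under the variance function V(mu) = mu**p."""
    return abs(y - mu) / mu ** (p / 2)

def conformal_quantile(scores, alpha):
    """Split-conformal quantile: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]

def predict_interval(mu, q, p=1.5):
    """Invert the score: all y with |y - mu| / mu**(p/2) <= q, clipped at zero."""
    half = q * mu ** (p / 2)
    return max(0.0, mu - half), mu + half

# Hypothetical stand-in for a fitted model's predicted mean claim cost.
def fitted_mean(x):
    return math.exp(0.5 + x)

rng = random.Random(0)

def draw(n):
    """Toy nonnegative responses: gamma noise whose mean is fitted_mean(x)."""
    xs = [rng.random() for _ in range(n)]
    ys = [rng.gammavariate(2.0, fitted_mean(x) / 2.0) for x in xs]
    return xs, ys

x_cal, y_cal = draw(500)    # calibration set, held out from model fitting
x_test, y_test = draw(2000)  # fresh points to check coverage on

alpha = 0.1  # target miscoverage, i.e. 90% nominal coverage
cal_scores = [pearson_score(y, fitted_mean(x)) for x, y in zip(x_cal, y_cal)]
q = conformal_quantile(cal_scores, alpha)

intervals = [predict_interval(fitted_mean(x), q) for x in x_test]
coverage = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, y_test)) / len(y_test)
avg_width = sum(hi - lo for lo, hi in intervals) / len(intervals)
print(f"coverage={coverage:.3f}, average width={avg_width:.3f}")
```

Because the half-width scales with mu^(p/2), intervals widen for drivers with larger predicted costs, which is the motivation for residual-type scores that account for the mean-variance relationship rather than raw absolute errors.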

Problem

Research questions and friction points this paper is trying to address.

Quantify prediction uncertainty for insurance claims

Develop non-conformity measures for GLMs and GBMs

Maintain nominal coverage while reducing prediction interval width

Innovation

Methods, ideas, or system contributions that make the work stand out.

Conformal predictive inference for uncertainty quantification

New non-conformity measures for GLMs and GBMs

Locally weighted Pearson residuals with LightGBM give the narrowest intervals at nominal coverage
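
One way to write down a Pearson-residual non-conformity score and the resulting interval is sketched below. The Tweedie variance function Var(Y | x) = phi * mu(x)^p and the split-conformal quantile are standard; the local scale estimate \hat{\rho}(x) is a hypothetical placeholder for the local weighting, and its exact form in the paper may differ. Setting \hat{\rho} \equiv 1 recovers the plain Pearson score.

```latex
% n calibration pairs (x_i, y_i); \hat{\mu} is the fitted mean; p is the Tweedie power.
% \hat{\rho}(x) > 0 is an assumed local scale estimate, not taken from the paper.
\begin{align*}
r_i &= \frac{y_i - \hat{\mu}(x_i)}{\hat{\mu}(x_i)^{p/2}},
\qquad
s_i = \frac{|r_i|}{\hat{\rho}(x_i)}, \qquad i = 1, \dots, n, \\
\hat{q} &= s_{(\lceil (n+1)(1-\alpha) \rceil)}
\quad \text{(the corresponding order statistic of the calibration scores)}, \\
C(x) &= \Bigl[\, \max\bigl\{0,\ \hat{\mu}(x) - \hat{q}\,\hat{\rho}(x)\,\hat{\mu}(x)^{p/2}\bigr\},\
\hat{\mu}(x) + \hat{q}\,\hat{\rho}(x)\,\hat{\mu}(x)^{p/2} \,\Bigr].
\end{align*}
```

Under exchangeability of the calibration and test points, an interval built this way covers the new response with probability at least 1 - alpha, whatever the true residual distribution.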

🔎 Similar Papers

DistPred: A Distribution-Free Probabilistic Inference Method for Regression and Forecasting