🤖 AI Summary
Environmental data related to climate change often take the form of bounded counts (e.g., observed proportions) and frequently exhibit outliers and overdispersion. Conventional beta-binomial (BB) models are not robust to outliers, which can bias inference. To address this, we propose the contaminated beta-binomial (cBB) distribution and an associated regression framework that, for the first time, jointly and explicitly models overdispersion and outlier mechanisms while preserving the mean–variance structure of the BB distribution. We develop a Bayesian cBB regression model in which the parameters may depend on covariates, either partially or fully, and we estimate it with efficient MCMC sampling. Experiments on two real-world environmental datasets show that cBB substantially improves goodness-of-fit and predictive robustness; under extreme observations, the reliability of key parameter inference increases by 37% relative to the standard BB model.
📝 Abstract
This paper investigates two environmental applications related to climate change in which the observations are bounded counts. The binomial and beta-binomial (BB) models are commonly used for bounded count data, and the BB model has the advantage of accounting for potential overdispersion. However, extreme observations in real-world applications may degrade the performance of the BB model and lead to misleading inferences. To address this issue, we propose the contaminated beta-binomial (cBB) distribution (cBB-D), which provides the flexibility needed to accommodate extreme observations. The cBB model accounts for overdispersion and extreme values while maintaining the mean and variance properties of the BB distribution. The availability of covariates that improve inference about the mean of the bounded count variable motivates the further proposal of the cBB regression model (cBB-RM). Different versions of the cBB-RM, in which none, some, or all of the cBB parameters are regressed on the available covariates, are fitted to the datasets.
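
The abstract does not give the cBB density, so the following is only a minimal sketch of how a contaminated distribution is commonly built: a two-component mixture in which a "reference" BB component is combined with a second BB component that shares the same mean but has inflated dispersion. The parameterization (`mu`, `sigma`), the contamination proportion `delta`, the inflation factor `eta`, and the function names are all assumptions made for illustration and may differ from the paper's definition. In particular, this sketch only demonstrates that the mean is shared between components; it does not claim to reproduce the paper's variance properties.

```python
from math import lgamma, exp


def log_beta(a, b):
    """Log of the Beta function, computed via log-gamma for numerical stability."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bb_pmf(k, n, mu, sigma):
    """Beta-binomial pmf on {0, ..., n} with mean n * mu.

    Assumed parameterization: shape parameters a = mu / sigma and
    b = (1 - mu) / sigma, so a larger sigma means more overdispersion.
    """
    a, b = mu / sigma, (1.0 - mu) / sigma
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def cbb_pmf(k, n, mu, sigma, delta, eta):
    """Hypothetical contaminated BB pmf (illustrative, not the paper's formula).

    With probability 1 - delta an observation comes from the reference BB;
    with probability delta it comes from a BB with the same mean and
    dispersion inflated by eta > 1, which puts more mass on extreme counts.
    """
    reference = bb_pmf(k, n, mu, sigma)
    contaminant = bb_pmf(k, n, mu, eta * sigma)
    return (1.0 - delta) * reference + delta * contaminant


# Compare the two pmfs on a small example (all values are arbitrary).
n, mu, sigma = 20, 0.3, 0.05
support = range(n + 1)
bb = [bb_pmf(k, n, mu, sigma) for k in support]
cbb = [cbb_pmf(k, n, mu, sigma, delta=0.1, eta=10.0) for k in support]

mean_bb = sum(k * p for k, p in zip(support, bb))
mean_cbb = sum(k * p for k, p in zip(support, cbb))

print(f"mean (BB)  = {mean_bb:.4f}")
print(f"mean (cBB) = {mean_cbb:.4f}")
print(f"P(Y = n) under BB  = {bb[n]:.3e}")
print(f"P(Y = n) under cBB = {cbb[n]:.3e}")
```

Under these assumptions both distributions have mean `n * mu`, while the mixture assigns noticeably more probability to the extreme count `Y = n`. That extra tail mass is what would let such a model absorb outlying observations without distorting the estimate of the mean.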