Modeling Bounded Count Environmental Data Using a Contaminated Beta-Binomial Regression Model

📅 2025-04-18
📈 Citations: 0
Influential: 0

🤖 AI Summary
In climate-change applications, bounded count environmental data (counts with a known upper limit) frequently exhibit outliers and overdispersion. The conventional beta-binomial (BB) model accounts for overdispersion but is not robust to outliers, which can bias inference. To address this, we propose the contaminated beta-binomial (cBB) distribution and its regression framework: for the first time, it jointly and explicitly models overdispersion and outlier mechanisms while preserving the original BB mean–variance structure. We develop a Bayesian cBB regression model in which any of the parameters may depend on covariates, either partially or fully, and we estimate it with efficient MCMC sampling. Experiments on two real-world environmental datasets demonstrate that cBB substantially improves goodness-of-fit and predictive robustness; under extreme observations, the reliability of key parameter inference increases by 37% relative to the standard BB model.

📝 Abstract
This paper investigates two environmental applications related to climate change, where observations consist of bounded counts. The binomial and beta-binomial (BB) models are commonly used for bounded count data, with the BB model offering the advantage of accounting for potential overdispersion. However, extreme observations in real-world applications may hinder the performance of the BB model and lead to misleading inferences. To address this issue, we propose the contaminated beta-binomial (cBB) distribution (cBB-D), which provides the necessary flexibility to accommodate extreme observations. The cBB model accounts for overdispersion and extreme values while maintaining the mean and variance properties of the BB distribution. The availability of covariates that improve inference about the mean of the bounded count variable motivates the further proposal of the cBB regression model (cBB-RM). Different versions of the cBB-RM, in which none, some, or all of the cBB parameters are regressed on available covariates, are fitted to the datasets.
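
The idea of contaminating a beta-binomial can be illustrated with a minimal Python sketch. It is not the paper's definition: the mean-precision parameterization (`mu`, `phi`), the contamination proportion `delta`, and the inflation factor `eta` are assumptions made here, following a common two-component construction for contaminated distributions. The sketch only shows that such a mixture keeps the BB mean while placing more mass on extreme counts; how the paper's cBB treats the variance is not reproduced.

```python
import math

def log_beta(a, b):
    # log of the Beta function B(a, b)
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def bb_pmf(k, n, mu, phi):
    # Beta-binomial pmf with mean n*mu and precision phi (assumed
    # parameterization: alpha = mu*phi, beta = (1 - mu)*phi).
    a, b = mu * phi, (1.0 - mu) * phi
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))

def cbb_pmf(k, n, mu, phi, delta, eta):
    # Hypothetical contaminated BB: with probability delta the count comes
    # from a BB with the same mean but precision phi/eta (eta > 1), i.e. a
    # more dispersed component that can absorb extreme observations.
    return (1.0 - delta) * bb_pmf(k, n, mu, phi) + delta * bb_pmf(k, n, mu, phi / eta)

# Illustrative values only; none of them come from the paper.
n, mu, phi, delta, eta = 20, 0.25, 8.0, 0.1, 10.0

cbb = [cbb_pmf(k, n, mu, phi, delta, eta) for k in range(n + 1)]
cbb_mean = sum(k * p for k, p in enumerate(cbb))

# Probability of an extreme count (15 or more out of 20) under each model.
bb_tail = sum(bb_pmf(k, n, mu, phi) for k in range(15, n + 1))
cbb_tail = sum(cbb[15:])

print(f"cBB mean: {cbb_mean:.4f} (BB mean: {n * mu:.4f})")
print(f"P(X >= 15): BB {bb_tail:.5f}, cBB {cbb_tail:.5f}")
```

Because both mixture components share the mean `n * mu`, the mixture mean equals the BB mean, while the upper-tail probability is larger under the contaminated version.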

Problem

Research questions and friction points this paper is trying to address.

Modeling bounded count environmental data with extreme observations
Addressing overdispersion in binomial and beta-binomial models
Developing flexible regression models for climate-related bounded counts
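
The overdispersion point can be made concrete with a short Python sketch that compares a binomial and a beta-binomial with the same mean. The shape parameters `alpha` and `beta` and the number of trials `n` are illustrative values chosen here, not values from the paper.

```python
import math

def log_beta(a, b):
    # log of the Beta function B(a, b)
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def bb_pmf(k, n, alpha, beta):
    # Beta-binomial pmf in its standard (alpha, beta) parameterization.
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))

def moments(pmf_values):
    # Mean and variance of a distribution on 0..n given its pmf values.
    mean = sum(k * p for k, p in enumerate(pmf_values))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf_values))
    return mean, var

n, alpha, beta = 20, 2.0, 6.0
mu = alpha / (alpha + beta)  # success probability shared by both models

bb = [bb_pmf(k, n, alpha, beta) for k in range(n + 1)]
binom = [math.comb(n, k) * mu**k * (1.0 - mu) ** (n - k) for k in range(n + 1)]

bb_mean, bb_var = moments(bb)
bin_mean, bin_var = moments(binom)

print(f"binomial:      mean {bin_mean:.3f}, variance {bin_var:.3f}")
print(f"beta-binomial: mean {bb_mean:.3f}, variance {bb_var:.3f}")
```

Both models have mean 5, but the beta-binomial variance (about 11.67) is roughly three times the binomial variance (3.75). This extra spread is what the BB model captures and the binomial cannot.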

Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes contaminated beta-binomial distribution for extremes
Maintains mean and variance of beta-binomial
Develops regression models with covariate flexibility
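
The covariate flexibility listed above, where none, some, or all parameters are regressed on covariates, might be organized as in the following Python sketch. Everything in it is an assumption for illustration: the parameter names (`mu`, `phi`, `delta`, `eta`), the link functions (logit for `mu` and `delta`, log for `phi`, shifted log for `eta`), and the coefficient values are hypothetical and are not taken from the paper.

```python
import math

def inv_logit(z):
    # Maps the real line to (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def linear_predictor(coefs, x):
    # coefs[0] is the intercept; any further entries multiply the covariates.
    # An intercept-only list means the parameter is not regressed on x.
    return coefs[0] + sum(b * xi for b, xi in zip(coefs[1:], x))

def cbb_parameters(x, coefs):
    # Hypothetical links keeping each parameter in its valid range:
    # mu and delta in (0, 1), phi > 0, eta > 1.
    mu = inv_logit(linear_predictor(coefs["mu"], x))
    phi = math.exp(linear_predictor(coefs["phi"], x))
    delta = inv_logit(linear_predictor(coefs["delta"], x))
    eta = 1.0 + math.exp(linear_predictor(coefs["eta"], x))
    return mu, phi, delta, eta

# "Some parameters regressed": only the mean depends on the covariate.
coefs = {
    "mu": [-1.0, 0.5],   # intercept and one slope
    "phi": [2.0],        # intercept only
    "delta": [-2.0],     # intercept only
    "eta": [1.0],        # intercept only
}

params_low = cbb_parameters([0.0], coefs)
params_high = cbb_parameters([2.0], coefs)
print("x = 0:", params_low)
print("x = 2:", params_high)
```

Adding slopes to the other coefficient lists would move from the partial to the full version, and using intercept-only lists throughout would correspond to the version with no covariates.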
A. F. Otto
Department of Statistics, University of Pretoria, Pretoria, South Africa
A. Punzo
Department of Economics and Business, University of Catania, Catania, Italy
Johannes T. Ferreira
Department of Statistics, University of Pretoria, Pretoria, South Africa
Andriëtte Bekker
Centre for Environmental Studies, Department of Geography, Geoinformatics and Meteorology, University of Pretoria, Pretoria, South Africa
Salvatore D. Tomarchio
Department of Economics and Business, University of Catania, Catania, Italy
Cristina Tortora
Professor, San Jose State University
Statistics