Stochastic EM Estimation and Inference for Zero-Inflated Beta-Binomial Mixed Models for Longitudinal Count Data

📅 2026-02-09
🤖 AI Summary
This study addresses the modeling of overdispersed, zero-inflated longitudinal count data by proposing a Zero-Inflated Beta-Binomial Mixed Effects Regression (ZIBBMR) model. The model combines a zero-inflation component with a beta-binomial count distribution and includes fixed effects for covariates and subject-specific random effects to account for within-subject correlation over time. Because the likelihood is intractable, the authors use latent variable augmentation together with a Stochastic Approximation EM (SAEM) algorithm for maximum likelihood estimation. In simulation studies, the method achieves accuracy comparable to leading mixed-model approaches and outperforms simpler zero-inflated count formulations, particularly in small samples. In a case study on longitudinal microbiome data, ZIBBMR is compared with a Zero-Inflated Beta Regression (ZIBR) benchmark, and the results suggest that applying count- and proportion-based models in parallel can make inference more robust when both data types are available.
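To make the model structure concrete, the following is a minimal sketch of a zero-inflated beta-binomial distribution with subject-specific random intercepts. The summary does not give the paper's parameterization, so everything specific here is an assumption made for illustration: the mean/precision form of the beta-binomial (`a = mu * phi`, `b = (1 - mu) * phi`), the logit links, the placement of one Gaussian random intercept in each component, and the hypothetical function names `zibb_logpmf` and `simulate_zibb_mixed`.

```python
import numpy as np
from scipy.special import expit
from scipy.stats import betabinom


def zibb_logpmf(y, n, pi0, mu, phi):
    """Log-pmf of a zero-inflated beta-binomial (assumed parameterization).

    pi0 : probability of a structural zero, in (0, 1)
    mu  : mean success proportion of the beta-binomial part, in (0, 1)
    phi : precision (> 0); smaller values mean more overdispersion
    """
    y = np.asarray(y)
    a, b = mu * phi, (1.0 - mu) * phi
    log_bb = betabinom.logpmf(y, n, a, b)
    # A zero arises either structurally or from the beta-binomial part.
    log_zero = np.logaddexp(np.log(pi0), np.log1p(-pi0) + log_bb)
    return np.where(y == 0, log_zero, np.log1p(-pi0) + log_bb)


def simulate_zibb_mixed(n_subjects, n_times, n_trials, rng,
                        alpha0=-0.5, beta0=-1.0,
                        sigma_a=0.7, sigma_b=0.7, phi=5.0):
    """Simulate longitudinal counts with random intercepts (illustrative)."""
    # Subject-specific random intercepts, shared across a subject's visits.
    a_i = rng.normal(0.0, sigma_a, size=(n_subjects, 1))
    b_i = rng.normal(0.0, sigma_b, size=(n_subjects, 1))
    pi0 = expit(alpha0 + a_i) * np.ones((1, n_times))  # zero-inflation prob.
    mu = expit(beta0 + b_i) * np.ones((1, n_times))    # mean proportion
    # Beta-binomial draw: beta-distributed proportion, then binomial count.
    p = rng.beta(mu * phi, (1.0 - mu) * phi)
    counts = rng.binomial(n_trials, p)
    structural_zero = rng.random((n_subjects, n_times)) < pi0
    return np.where(structural_zero, 0, counts)
```

Calling `simulate_zibb_mixed(50, 4, 100, np.random.default_rng(0))` returns a 50 by 4 array of counts in which zeros come from both the structural-zero mechanism and the beta-binomial part, which is the feature that a plain beta-binomial or Poisson model cannot capture.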

📝 Abstract
Analyzing overdispersed, zero-inflated, longitudinal count data poses significant modeling and computational challenges, which standard count models (e.g., Poisson or negative binomial mixed effects models) fail to adequately address. We propose a Zero-Inflated Beta-Binomial Mixed Effects Regression (ZIBBMR) model that augments a beta-binomial count model with a zero-inflation component, fixed effects for covariates, and subject-specific random effects, accommodating excessive zeros, overdispersion, and within-subject correlation. Maximum likelihood estimation is performed via a Stochastic Approximation EM (SAEM) algorithm with latent variable augmentation, which circumvents the model's intractable likelihood and enables efficient computation. Simulation studies show that ZIBBMR achieves accuracy comparable to leading mixed-model approaches in the literature and surpasses simpler zero-inflated count formulations, particularly in small-sample scenarios. As a case study, we analyze longitudinal microbiome data, comparing ZIBBMR with an external Zero-Inflated Beta Regression (ZIBR) benchmark; the results indicate that applying both count- and proportion-based models in parallel can enhance inference robustness when both data types are available.
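The SAEM idea mentioned in the abstract can be illustrated on a much simpler model. The sketch below is not the ZIBBMR estimator: it assumes a toy Poisson model with a Gaussian random intercept, `y_ij | eta_i ~ Poisson(exp(eta_i))` with `eta_i ~ N(mu, sigma2)`, chosen only because its complete-data sufficient statistics are easy to write down. The function name, the random-walk Metropolis sampler, its step size, and the step-size schedule (1 during burn-in, then `1 / (k - burn_in)`) are all assumptions made for illustration.

```python
import numpy as np


def saem_poisson_random_intercept(y, n_iter=400, burn_in=200, n_mh=5, seed=0):
    """Toy SAEM for y_ij | eta_i ~ Poisson(exp(eta_i)), eta_i ~ N(mu, sigma2).

    y : integer array of shape (n_subjects, n_obs).
    Returns the estimates (mu, sigma2).
    """
    rng = np.random.default_rng(seed)
    n_subj, n_obs = y.shape
    y_sum = y.sum(axis=1)

    mu, sigma2 = 0.0, 1.0
    eta = np.log(y.mean(axis=1) + 0.5)        # starting latent values
    s1, s2 = eta.mean(), np.mean(eta ** 2)    # running sufficient statistics

    def log_post(e, mu, sigma2):
        # Log of p(eta_i | y_i) up to a constant, evaluated per subject.
        return y_sum * e - n_obs * np.exp(e) - 0.5 * (e - mu) ** 2 / sigma2

    for k in range(1, n_iter + 1):
        # Simulation step: a few random-walk Metropolis updates of eta.
        for _ in range(n_mh):
            prop = eta + rng.normal(0.0, 0.3, size=n_subj)
            log_ratio = log_post(prop, mu, sigma2) - log_post(eta, mu, sigma2)
            accept = np.log(rng.random(n_subj)) < log_ratio
            eta = np.where(accept, prop, eta)

        # Stochastic approximation step: smooth the sufficient statistics.
        gamma = 1.0 if k <= burn_in else 1.0 / (k - burn_in)
        s1 += gamma * (eta.mean() - s1)
        s2 += gamma * (np.mean(eta ** 2) - s2)

        # Maximization step: closed form for the Gaussian random effect.
        mu = s1
        sigma2 = max(s2 - s1 ** 2, 1e-6)

    return mu, sigma2
```

The three steps inside the loop (simulate the latent variables, update the sufficient statistics with a decreasing step size, maximize in closed form) are what lets this type of algorithm avoid evaluating the intractable marginal likelihood directly. In a model such as ZIBBMR the latent variables would also include the augmented quantities the abstract refers to, and the maximization step would generally not be available in closed form.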
Problem

Research questions and friction points this paper is trying to address.

zero-inflated
overdispersed
longitudinal count data
mixed models
beta-binomial
Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero-Inflated Beta-Binomial
Mixed Effects Model
Stochastic Approximation EM
Longitudinal Count Data
Overdispersion
John Barrera
Instituto de Ingeniería Matemática, Facultad de Ingeniería, Universidad de Valparaíso, Valparaíso, Chile
Ana Arribas-Gil
Departamento de Estadística, Universidad Carlos III de Madrid, Getafe, Spain
Dae-Jin Lee
School of Science & Technology, IE University
Data Science, Statistical Modelling, Sports Analytics, Biostatistics, Semiparametric regression
Cristian Meza
CIMFAV, Universidad de Valparaíso, Valparaíso, Chile