Bayesian analysis of product feature allocation models

📅 2024-08-28
🏛️ Journal of the Royal Statistical Society Series B: Statistical Methodology
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses predictive distributions, posterior inference, and α-diversity for feature allocation models with product-form structure, a class that includes the Indian buffet process (IBP). Method: The authors develop a general Bayesian nonparametric theory for the entire class, which shares structural characteristics with Gibbs-type priors in the species sampling framework. They derive closed-form expressions for the predictive structure and the posterior law of the underlying stochastic process, describe the distribution of the number of features and of the number of hitherto unseen features in a future sample, and examine novel examples such as mixtures of IBPs and beta Bernoulli models, the latter entailing a finite random number of features. Results: The methodology is applied to species richness estimation from incidence data, demonstrated on plant diversity in Danish forests and on trees in Barro Colorado Island.

📝 Abstract
Feature allocation models are an extension of Bayesian nonparametric clustering models, where individuals can share multiple features. We study a broad class of models whose probability distribution has a product form, which includes the popular Indian buffet process. This class plays a prominent role among existing priors, and it shares structural characteristics with Gibbs-type priors in the species sampling framework. We develop a general theory for the entire class, obtaining closed form expressions for the predictive structure and the posterior law of the underlying stochastic process. Additionally, we describe the distribution for the number of features and the number of hitherto unseen features in a future sample, leading to the α-diversity for feature models. We also examine notable novel examples, such as mixtures of Indian buffet processes and beta Bernoulli models, where the latter entails a finite random number of features. This methodology finds significant applications in ecology, allowing the estimation of species richness for incidence data, as we demonstrate by analyzing plant diversity in Danish forests and trees in Barro Colorado Island.
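
As a concrete illustration of the feature-sharing mechanism mentioned in the abstract, the following is a minimal sketch of the standard one-parameter Indian buffet process only, not of the general product-form class studied in the paper. The function names, the `alpha` mass parameter, and the `seed` argument are assumptions chosen for this example.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_individuals, alpha, seed=0):
    """Simulate a binary feature matrix from the one-parameter IBP.

    Returns a list of rows, one per individual; entry [i][k] is 1 when
    individual i displays feature k. Names here are illustrative only.
    """
    rng = random.Random(seed)
    feature_counts = []  # feature_counts[k] = individuals so far with feature k
    rows = []
    for i in range(1, num_individuals + 1):
        # An existing feature k is displayed with probability m_k / i.
        row = [1 if rng.random() < m / i else 0 for m in feature_counts]
        for k, taken in enumerate(row):
            feature_counts[k] += taken
        # The individual also introduces Poisson(alpha / i) new features.
        new_features = sample_poisson(alpha / i, rng)
        row.extend([1] * new_features)
        feature_counts.extend([1] * new_features)
        rows.append(row)
    # Pad earlier rows with zeros so the matrix is rectangular.
    width = len(feature_counts)
    return [row + [0] * (width - len(row)) for row in rows]
```

Calling `sample_ibp(20, 3.0)` returns a 20-row incidence matrix in which each column is a feature (for instance, a species) and each row is an individual (for instance, a sampled plot). Under this process the expected number of distinct features after n individuals is alpha times the n-th harmonic number, which grows logarithmically in n.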
Problem

Research questions and friction points this paper is trying to address.

Developing a general theory for Bayesian product feature allocation models
Deriving predictive structure and posterior law for feature-sharing processes
Applying feature models to estimate species richness in ecological datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed a general theory for Bayesian product-form feature allocation models
Derived closed-form expressions for the predictive structure and the posterior law
Applied the methodology to estimate species richness from ecological incidence data
Lorenzo Ghilotti
Department of Economics, Management, and Statistics, University of Milano–Bicocca, 20126 Milano, Italy
Federico Camerlenghi
Professor of Statistics, University of Milano - Bicocca
Bayesian nonparametrics, species sampling models, completely random measures, exchangeability
T. Rigon
Department of Economics, Management, and Statistics, University of Milano–Bicocca, 20126 Milano, Italy