Bayesian nonparametric modeling of multivariate count data with an unknown number of traits

📅 2025-10-28
🤖 AI Summary
Problem: Modeling multivariate count data is challenging when the number of latent traits is unknown and the data come from heterogeneous but related groups. Method: This paper proposes a tractable Bayesian nonparametric framework built on completely random vectors, replacing the standard assumption of full exchangeability with partial exchangeability. The number of traits is treated as a random quantity, and a novel mixture model infers the group partition from the data, avoiding the systematic overclustering that arises when the number of traits is assumed to be fixed. Contribution/Results: Theoretically, the paper derives closed-form expressions for the marginal and posterior distributions and illustrates them for binary and Poisson-distributed traits. Empirically, applied to the 'Ndrangheta criminal network from the Operazione Infinito investigation, the model provides insights into the organization of illicit activities.

📝 Abstract
Feature and trait allocation models are fundamental objects in Bayesian nonparametrics and play a prominent role in several applications. Existing approaches, however, typically assume full exchangeability of the data, which may be restrictive in settings characterized by heterogeneous but related groups. In this paper, we introduce a general and tractable class of Bayesian nonparametric priors for partially exchangeable trait allocation models, relying on completely random vectors. We provide a comprehensive theoretical analysis, including closed-form expressions for marginal and posterior distributions, and illustrate the tractability of our framework in the cases of binary and Poisson-distributed traits. A distinctive aspect of our approach is that the number of traits is a random quantity, thereby allowing us to model and estimate unobserved traits. Building on these results, we also develop a novel mixture model that infers the group partition structure from the data, effectively clustering trait allocations. This extension generalizes Bayesian nonparametric latent class models and avoids the systematic overclustering that arises when the number of traits is assumed to be fixed. We demonstrate the practical usefulness of our methodology through an application to the 'Ndrangheta criminal network from the Operazione Infinito investigation, where our model provides insights into the organization of illicit activities.
Problem

Research questions and friction points this paper is trying to address.

Modeling multivariate count data with an unknown number of traits using Bayesian nonparametrics
Developing partially exchangeable trait allocation models for heterogeneous but related groups
Inferring the group partition by clustering trait allocations, without the overclustering caused by fixing the number of traits
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian nonparametric priors for partially exchangeable trait allocation models, built on completely random vectors
A random number of traits, which allows unobserved traits to be modeled and estimated
A mixture model that infers the group partition structure from the data
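
To make the idea of a random number of traits concrete, here is a minimal sketch of the classical one-parameter Indian buffet process, a standard fully exchangeable prior for binary trait matrices. It is not the partially exchangeable prior proposed in the paper, and it does not cover Poisson-distributed traits; the function names `sample_poisson` and `sample_ibp` and the parameter `alpha` are illustrative assumptions. The point is only that the number of columns (traits) is an outcome of the simulation rather than an input.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson(rate) variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_subjects, alpha, seed=0):
    """Simulate a binary trait matrix from the one-parameter Indian buffet process.

    Illustrative only: a single exchangeable group, binary traits.
    """
    rng = random.Random(seed)
    trait_counts = []  # trait_counts[k] = subjects displaying trait k so far
    rows = []          # rows[i] = set of trait indices displayed by subject i
    for i in range(1, num_subjects + 1):
        # An existing trait k is displayed with probability m_k / i,
        # where m_k counts the earlier subjects that display it.
        row = {k for k, m in enumerate(trait_counts) if rng.random() < m / i}
        # The subject also introduces Poisson(alpha / i) brand-new traits.
        num_new = sample_poisson(alpha / i, rng)
        first_new = len(trait_counts)
        row.update(range(first_new, first_new + num_new))
        trait_counts.extend([0] * num_new)
        for k in row:
            trait_counts[k] += 1
        rows.append(row)
    num_traits = len(trait_counts)  # random: not fixed in advance
    return [[1 if k in row else 0 for k in range(num_traits)] for row in rows]


if __name__ == "__main__":
    matrix = sample_ibp(num_subjects=10, alpha=2.0, seed=42)
    print("number of traits discovered:", len(matrix[0]))
    for row in matrix:
        print(row)
```

Rerunning with different seeds yields matrices with different numbers of columns, which is the behavior a fixed-dimensional model rules out by construction.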
Authors

Lorenzo Ghilotti
Department of Economics, Management, and Statistics, University of Milano–Bicocca, 20126 Milano, Italy

Federico Camerlenghi
Professor of Statistics, University of Milano–Bicocca
Bayesian nonparametrics, species sampling models, completely random measures, exchangeability

Tommaso Rigon
Department of Economics, Management, and Statistics, University of Milano–Bicocca, 20126 Milano, Italy

Michele Guindani
Department of Biostatistics, University of California, Los Angeles
Bayesian Analysis, Bayesian Nonparametrics, Neuroimaging, Imaging Genetics, Statistical decision making