A discomfort-informed adaptive Gibbs sampler for finite mixture models

📅 2025-12-01
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Traditional Gibbs sampling for Bayesian inference in finite mixture models suffers from low efficiency—particularly in high-dimensional electronic health record (EHR) clustering—due to redundant updates and slow convergence. To address this, we propose a discomfort-guided adaptive Gibbs sampling method, where “discomfort” quantifies the uncertainty of an observation’s current cluster assignment, enabling selective resampling only for low-confidence data points. This work is the first to integrate classification confidence modeling directly into the MCMC framework, dynamically adapting the update strategy via historical sampling trajectories. Experiments on both synthetic and real-world EHR datasets demonstrate that our method significantly accelerates posterior convergence (average speedup of 2.3×) and substantially reduces wasteful computation, outperforming state-of-the-art sampling techniques.

Technology Category

Application Category

📝 Abstract
Finite mixture models are frequently used to uncover latent structures in high-dimensional datasets (e.g. identifying clusters of patients in electronic health records). The inference of such structures can be performed in a Bayesian framework, and involves the use of sampling algorithms such as Gibbs samplers aimed at deriving posterior distribution of the probabilities of observations to belong to specific clusters. Unfortunately, traditional implementations of Gibbs samplers in this context often face critical challenges, such as inefficient use of computational resources and unnecessary updates for observations that are highly likely to remain in their current cluster. This paper introduces a new adaptive Gibbs sampler that improves the convergence efficiency over existing methods. In particular, our sampler is guided by a function that, at each iteration, uses the past of the chain to focus the updating on observations potentially misclassified in the current clustering, i.e. those with a low probability of belonging to their current component. Through simulation studies and two real data analyses, we empirically demonstrate that, in terms of convergence time, our method tends to perform more efficiently compared to state-of-the-art approaches.
Problem

Research questions and friction points this paper is trying to address.

Improves convergence efficiency of Gibbs samplers for finite mixture models
Focuses updates on potentially misclassified observations to reduce computational waste
Addresses inefficient resource use in Bayesian clustering inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Gibbs sampler using discomfort function
Focuses updates on potentially misclassified observations
Improves convergence efficiency in mixture models
🔎 Similar Papers
No similar papers found.
D
Davide Fabbrico
University of Florence, Department of Statistics, IT
A
Andi Q. Wang
University of Warwick, Department of Statistics, UK
S
Sebastiano Grazzi
University of Warwick, Department of Statistics, UK
A
Alice Corbella
University of Warwick, Department of Statistics, UK
G
Gareth O. Roberts
University of Warwick, Department of Statistics, UK
Sylvia Richardson
Sylvia Richardson
Director of MRC Biostatistics Unit and Professor of Biostatistics, University of Cambridge
Statistical genomicshigh-dimensional databioinformatics
Filippo Pagani
Filippo Pagani
MRC Biostatistics Unit, University of Cambridge, UK
P
Paul D. W. Kirk
MRC Biostatistics Unit, University of Cambridge, UK