Rectified Fisher-Bingham Model for Compositional Data with Zeros

📅 2026-04-27

🤖 AI Summary
Compositional data such as microbiome profiles often contain many zeros, which conventional modeling approaches handle poorly. This work proposes a unified probabilistic framework that maps compositions to the positive orthant of the unit hypersphere via a square-root transformation. By combining a latent Fisher-Bingham distribution with a deterministic transformation, the model generates exact zeros directly, without imputation or separate zero-inflated components. It therefore supports full likelihood-based inference for zero-containing compositional data, with parameters estimated by a Monte Carlo expectation-maximization algorithm, and provides a score test for structured differences in composition across groups as a parametric alternative to distance-based methods. Simulations show higher power for detecting structured compositional changes, particularly when many components are zero, and an application to a dietary intervention study identifies microbiota shifts not detected by standard approaches.
📝 Abstract
This paper introduces a rectified and renormalized Fisher-Bingham model for compositional data with zeros, motivated in part by the presence of zeros in microbiota studies. The approach represents compositions through a square-root transformation that maps data to the positive orthant of the unit sphere, and models them via a latent Fisher-Bingham distribution followed by a deterministic transformation that induces exact zeros. This construction yields a coherent likelihood without requiring zero imputation or separate modeling of zero and nonzero components. Parameter estimation is performed using a Monte Carlo expectation-maximization algorithm that accommodates the latent structure. We further develop a score test for detecting structured differences in composition across groups, providing a parametric alternative to commonly used distance-based methods. Simulation studies demonstrate that the proposed method closely approximates the induced distribution and achieves higher power for detecting structured compositional changes, particularly when observations include many zero-valued components. An application to a dietary intervention study illustrates that the method identifies meaningful microbiota shifts not detected by standard approaches.
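The geometry described in the abstract can be illustrated with a small sketch. It shows the square-root transformation placing a composition on the unit sphere, and one plausible reading of "rectified and renormalized": negative coordinates of a latent point are clipped to zero and the result is rescaled to unit length. That reading is an assumption based on the wording here, not the paper's stated definition. The function names and the latent vector are hypothetical, and no Fisher-Bingham sampling is attempted.

```python
import math


def sqrt_transform(composition):
    """Map a composition (nonnegative parts) to the positive orthant of the unit sphere."""
    total = sum(composition)
    return [math.sqrt(p / total) for p in composition]


def rectify_renormalize(z):
    """Assumed rectification: clip negatives to zero, then rescale to unit length."""
    clipped = [max(v, 0.0) for v in z]
    norm = math.sqrt(sum(v * v for v in clipped))
    if norm == 0.0:
        raise ValueError("all coordinates are non-positive; no direction remains")
    return [v / norm for v in clipped]


def to_composition(y):
    """Invert the square-root transform: squared unit-vector entries sum to one."""
    return [v * v for v in y]


# An observed composition with one exact zero.
composition = [0.5, 0.3, 0.2, 0.0]
y = sqrt_transform(composition)

# A hypothetical latent point on the full sphere (not drawn from a Fisher-Bingham law).
raw = [0.6, -0.3, 0.7, 0.1]
raw_norm = math.sqrt(sum(v * v for v in raw))
latent = [v / raw_norm for v in raw]

# The negative second coordinate becomes an exact zero in the induced composition.
induced = to_composition(rectify_renormalize(latent))
```

Under this reading, zeros arise whenever a latent coordinate is non-positive, so no imputation step or separate zero-inflation component is needed to place mass on the boundary of the simplex.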
Problem

Research questions and friction points this paper is trying to address.

compositional data
zeros
Fisher-Bingham model
microbiota studies
likelihood modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fisher-Bingham distribution
compositional data with zeros
square-root transformation
Monte Carlo EM algorithm
score test
Authors

Eugene Han
Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, IL.
Marahi Perez-Tamayo
Division of Nutritional Sciences, University of Illinois at Urbana-Champaign, Champaign, IL.
Hannah D. Holscher
Division of Nutritional Sciences, University of Illinois at Urbana-Champaign, Champaign, IL; Department of Food Science and Human Nutrition, University of Illinois at Urbana-Champaign, Champaign, IL.
Ruoqing Zhu
University of Illinois Urbana-Champaign