Tree-based variational inference for Poisson log-normal models

📅 2024-06-25
đŸ€– AI Summary
Existing Poisson-lognormal (PLN) models fail to capture hierarchical structures—such as microbial taxonomies, administrative geographies, or product categories—commonly present in count data, limiting their interpretability and generalizability in ecology, medicine, and related fields. To address this, we propose PLN-Tree: the first generative PLN model that explicitly incorporates a hierarchical tree prior to encode parent–child dependencies among entities. Methodologically, we design a structured variational inference algorithm and establish theoretical guarantees for parameter identifiability. The model supports hierarchy-aware inference and downstream classification tasks. Experiments on synthetic and real-world microbiome datasets demonstrate that PLN-Tree significantly improves accuracy in modeling hierarchical dependencies and enhances ecological interpretability. Our results underscore the critical role of domain-specific prior knowledge—e.g., taxonomic graphs—in modeling complex systems.

📝 Abstract
When studying ecosystems, hierarchical trees are often used to organize entities based on proximity criteria, such as the taxonomy in microbiology, social classes in geography, or product types in retail businesses, offering valuable insights into entity relationships. Despite their significance, current count-data models do not leverage this structured information. In particular, the widely used Poisson log-normal (PLN) model, known for its ability to model interactions between entities from count data, cannot incorporate such hierarchical tree structures, which limits its applicability in domains where they arise. To address this, we introduce the PLN-Tree model, an extension of the PLN model specifically designed for hierarchical count data. By integrating structured variational inference techniques, we propose an adapted training procedure and establish identifiability results, strengthening both the theoretical foundations and the practical interpretability of the model. Additionally, we extend our framework to classification tasks as a preprocessing pipeline for compositional data, showcasing its versatility. Experimental evaluations on synthetic datasets and real-world microbiome data demonstrate the superior performance of the PLN-Tree model in capturing hierarchical dependencies and providing insights into complex data structures, showing the practical value of knowledge graphs such as the taxonomy in ecosystem modeling.
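
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of sampling from a standard (non-hierarchical) PLN model: a Gaussian latent vector is drawn per sample, and each count is Poisson with rate equal to the exponential of its latent coordinate. The function name `sample_pln` and the values of `mu` and `sigma` are illustrative assumptions, not taken from the paper, and this does not implement PLN-Tree.

```python
import numpy as np

def sample_pln(n_samples, mu, sigma, seed=0):
    """Draw counts from a standard PLN model (illustrative helper, not the paper's code)."""
    rng = np.random.default_rng(seed)
    # Latent layer: Z ~ N(mu, sigma); sigma encodes dependencies between entities.
    latents = rng.multivariate_normal(mu, sigma, size=n_samples)
    # Observation layer: Y_j | Z_j ~ Poisson(exp(Z_j)), independently per entity.
    counts = rng.poisson(np.exp(latents))
    return latents, counts

# Assumed toy parameters for three entities.
mu = np.array([1.0, 0.5, 2.0])
sigma = np.array([
    [0.5, 0.2, 0.0],
    [0.2, 0.5, 0.1],
    [0.0, 0.1, 0.3],
])

latents, counts = sample_pln(1000, mu, sigma)
print(counts[:5])
```

In this sketch the covariance `sigma` is the only place where entities interact, which is the structure the abstract describes the PLN model as capturing; nothing here uses a tree.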
Problem

Research questions and friction points this paper is trying to address.

Extends PLN model for hierarchical count data
Incorporates tree structures in Poisson log-normal models
Improves interpretability via structured variational inference
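
To make "hierarchical count data" concrete, here is a small sketch of how leaf-level counts can be aggregated along a tree so that each parent's count is the sum of its children's counts. The two-level taxonomy in `parent_of_leaf` and the helper `aggregate_to_parents` are hypothetical, and the snippet only illustrates the data structure, not the PLN-Tree model or its inference procedure.

```python
import numpy as np

# Hypothetical taxonomy: 5 leaves (e.g. genera) grouped under 2 parents (e.g. families).
# parent_of_leaf[j] is the index of the parent of leaf j.
parent_of_leaf = np.array([0, 0, 1, 1, 1])

def aggregate_to_parents(leaf_counts, parent_of_leaf, n_parents):
    """Sum leaf counts into their parent nodes (illustrative helper)."""
    # membership[j, k] == 1 when leaf j belongs to parent k, else 0.
    membership = np.eye(n_parents, dtype=leaf_counts.dtype)[parent_of_leaf]
    # Matrix product sums the leaf columns that share a parent.
    return leaf_counts @ membership

# Two samples of made-up leaf counts.
leaf_counts = np.array([
    [3, 1, 0, 2, 4],
    [0, 5, 2, 0, 1],
])

parent_counts = aggregate_to_parents(leaf_counts, parent_of_leaf, n_parents=2)
print(parent_counts)
```

Applying the same helper level by level would give counts at every depth of a deeper tree, with totals preserved between levels.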
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends PLN model for hierarchical count data
Uses structured variational inference techniques
Improves interpretability with identifiable features
🔎 Similar Papers
No similar papers found.
Alexandre Chaussard
CNRS, Laboratoire de Probabilités, Statistique et Modélisation, LPSM, Sorbonne Université, F-75005 Paris, France
Anna Bonnet
Sorbonne Université
Elisabeth Gassiat
CNRS, Laboratoire de MathĂ©matiques d’Orsay, LMO, UniversitĂ© Paris-Saclay, Orsay, France
Sylvain Le Corff
LPSM, Sorbonne Université
Monte Carlo methods, Markov Chains, Computational Statistics, Nonparametric Statistics