Adaptive partition Factor Analysis

๐Ÿ“… 2024-10-24
๐Ÿ“ˆ Citations: 2
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
Traditional factor analysis struggles to distinguish latent factors shared across multiple studies from study- or subgroup-specific sources of variation. To address this, we propose an adaptive multi-study joint factor model that employs a novel hierarchical shrinkage prior to induce sparsity and structural adaptivity in factor loadings. This is the first Bayesian framework to rigorously ensure identifiability of multi-study factor loadings while enabling unbiased estimation of subgroup-specific factors. The method flexibly infers hierarchical factor structuresโ€”from globally shared factors, to cross-study subgroups, down to fine-grained factors nested within individual studies. Simulation studies demonstrate estimation accuracy comparable to state-of-the-art methods, with substantially improved interpretability. Applied to avian co-occurrence and ovarian cancer gene expression datasets, the model successfully identifies robust cross-cohort biological signals and subgroup-specific driver factors.

Technology Category

Application Category

๐Ÿ“ Abstract
Factor Analysis has traditionally been utilized across diverse disciplines to extrapolate latent traits that influence the behavior of multivariate observed variables. Historically, the focus has been on analyzing data from a single study, neglecting the potential study-specific variations present in data from multiple studies. Multi-study factor analysis has emerged as a recent methodological advancement that addresses this gap by distinguishing between latent traits shared across studies and study-specific components arising from artifactual or population-specific sources of variation. In this paper, we extend the current methodologies by introducing novel shrinkage priors for the latent factors, thereby accommodating a broader spectrum of scenarios -- from the absence of study-specific latent factors to models in which factors pertain only to small subgroups nested within or shared between the studies. For the proposed construction we provide conditions for identifiability of factor loadings and guidelines to perform straightforward posterior computation via Gibbs sampling. Through comprehensive simulation studies, we demonstrate that our proposed method exhibits competing performance across a variety of scenarios compared to existing methods, yet providing richer insights. The practical benefits of our approach are further illustrated through applications to bird species co-occurrence data and ovarian cancer gene expression data.
Problem

Research questions and friction points this paper is trying to address.

Extending factor analysis to handle multi-study data variations
Introducing shrinkage priors for flexible latent factor modeling
Improving identifiability and computation in multi-study factor analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces novel shrinkage priors for latent factors
Extends multi-study factor analysis methodology
Provides identifiability conditions and Gibbs sampling guidelines
๐Ÿ”Ž Similar Papers
No similar papers found.
E
Elena Bortolato
Department of Business and Economics, Universitat Pompeu Fabra, Data Science Center, Barcelona School of Economics
Antonio Canale
Antonio Canale
Associate professor, University of Padova
Bayesian nonparametricsFunctional Data AnalysisFlexible distributions