AI Summary
In multimodal single-cell analysis, a key challenge for disentangled representation learning is to learn, alongside shared features, modality-specific features that are both independent of the shared ones and informationally complete within each modality. To address this, we propose IndiSeek, a variational disentanglement framework grounded in the information bottleneck principle. IndiSeek introduces a differentiable reconstruction loss derived from a lower bound on conditional mutual information, enabling joint optimization of feature independence and intra-modality information completeness. By leveraging variational inference to approximate intractable mutual-information terms, it avoids adversarial training and hard constraints, remaining fully end-to-end differentiable. Evaluated on synthetic data, a CITE-seq dataset, and multiple real-world multimodal benchmarks, IndiSeek achieves superior feature disentanglement quality and establishes new state-of-the-art performance across downstream tasks, including cell type annotation, batch correction, and cross-modality imputation.
Abstract
Learning disentangled representations is a fundamental task in multimodal learning. In modern applications such as single-cell multi-omics, both shared and modality-specific features are critical for characterizing cell states and supporting downstream analyses. Ideally, modality-specific features should be independent of the shared ones while also capturing all complementary information within each modality. This trade-off is naturally expressed through information-theoretic criteria, but mutual-information-based objectives are difficult to estimate reliably, and their variational surrogates often underperform in practice. In this paper, we introduce IndiSeek, a novel disentangled representation learning approach that addresses this challenge by combining an independence-enforcing objective with a computationally efficient reconstruction loss that bounds conditional mutual information. This formulation explicitly balances independence and completeness, enabling principled extraction of modality-specific features. We demonstrate the effectiveness of IndiSeek on synthetic simulations, a CITE-seq dataset, and multiple real-world multimodal benchmarks.
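To make the two competing objectives concrete, the sketch below pairs a reconstruction term (completeness: the shared and modality-specific codes together should retain the modality's information) with a cross-covariance penalty as a crude independence surrogate between the two codes. This is a minimal toy illustration under assumed names and a linear "decoder"; it is not IndiSeek's actual variational objective or architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all dimensions and the linear decoder W are illustrative):
# z_shared holds shared features, z_spec holds modality-specific features,
# and x is the observed modality we try to reconstruct from both.
n, d_s, d_m, d_x = 200, 4, 3, 6
z_shared = rng.normal(size=(n, d_s))
z_spec = rng.normal(size=(n, d_m))
W = rng.normal(size=(d_s + d_m, d_x))
x = np.concatenate([z_shared, z_spec], axis=1) @ W  # toy ground truth

def disentangle_loss(z_s, z_m, x, W, lam=1.0):
    """Reconstruction (completeness) term plus an independence surrogate.

    The squared cross-covariance between the centered shared and
    modality-specific codes is a simple stand-in for the kind of
    independence-enforcing objective described in the text.
    """
    x_hat = np.concatenate([z_s, z_m], axis=1) @ W
    recon = np.mean((x - x_hat) ** 2)
    zs_c = z_s - z_s.mean(axis=0)
    zm_c = z_m - z_m.mean(axis=0)
    cross_cov = zs_c.T @ zm_c / len(z_s)
    indep = np.sum(cross_cov ** 2)
    return recon + lam * indep

loss = disentangle_loss(z_shared, z_spec, x, W)
```

In a real variational model the reconstruction term would be a decoder likelihood and the independence term a mutual-information bound, but the trade-off between the two has the same shape: lowering `lam` favors completeness, raising it favors independence.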