🤖 AI Summary
Existing disentanglement definitions and metrics assume mutual independence among latent factors, failing to capture inherent statistical dependencies among real-world factors—leading to poor generalization in practical scenarios.
Method: We propose the first information-theoretic, generalized disentanglement definition that explicitly accommodates non-independent factors and establish its theoretical connection to the information bottleneck principle. Building upon this, we design the first computable, robust disentanglement metric for non-independent factors—the Generalized Disentanglement Score (G-Disentanglement Score)—integrating mutual information, conditional mutual information, and statistical dependence modeling.
Results: Evaluated on controlled synthetic experiments and a unified benchmark, our metric consistently outperforms existing measures across multiple non-independent factor settings, achieving an average improvement of 23.6%. It exhibits strong theoretical grounding and empirical consistency, providing a principled, generalizable evaluation standard for representation learning in realistic settings.
📝 Abstract
Representation learning is an approach that allows to discover and extract the factors of variation from the data. Intuitively, a representation is said to be disentangled if it separates the different factors of variation in a way that is understandable to humans. Definitions of disentanglement and metrics to measure it usually assume that the factors of variation are independent of each other. However, this is generally false in the real world, which limits the use of these definitions and metrics to very specific and unrealistic scenarios. In this paper we give a definition of disentanglement based on information theory that is also valid when the factors of variation are not independent. Furthermore, we relate this definition to the Information Bottleneck Method. Finally, we propose a method to measure the degree of disentanglement from the given definition that works when the factors of variation are not independent. We show through different experiments that the method proposed in this paper correctly measures disentanglement with non-independent factors of variation, while other methods fail in this scenario.