🤖 AI Summary
Existing multi-view learning methods rely on the “multi-view redundancy assumption,” leveraging only shared information across views while neglecting the discriminative value of view-specific information.
Method: This paper proposes CIML, the first framework to explicitly discard this assumption. Grounded in information theory, CIML jointly models both shared and unique information: it employs Gács-Körner common information to capture strictly shared structure, and integrates the Information Bottleneck principle with mutual information minimization to explicitly separate and compress shared and view-specific representations.
Contribution/Results: We theoretically prove that the learned joint representation is predictively sufficient for downstream tasks. Extensive experiments on multiple benchmarks demonstrate significant improvements over state-of-the-art methods, validating that representations jointly preserving shared and unique information are more complete and discriminative.
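As a hedged sketch of the ingredients named above (the notation here is ours, not taken from the paper), the three components can be written as:

```latex
% Gács–Körner common information: the most informative variable Z that is
% (almost surely) a deterministic function of every view
C_{GK}(X_1; X_2) \;=\; \max_{Z}\; H(Z)
\quad \text{s.t.}\quad Z = f_1(X_1) = f_2(X_2) \;\text{a.s.}

% Information Bottleneck for a representation Z of input X and label Y
\max_{Z}\; I(Z; Y) \;-\; \beta\, I(Z; X)

% Mutual-information minimization separating the unique codes U_i
% from the common code C and from one another
\min\; \sum_i I(U_i; C) \;+\; \sum_{i \neq j} I(U_i; U_j)
```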
📝 Abstract
Information theory has inspired numerous advancements in multi-view learning. Most multi-view methods incorporating information-theoretic principles rely on an assumption called multi-view redundancy, which states that the common information between views is necessary and sufficient for downstream tasks. This assumption emphasizes the importance of common information for prediction, but inherently ignores the potential of unique information in each view that could be predictive of the task. In this paper, we propose a comprehensive information-theoretic multi-view learning framework named CIML, which discards the assumption of multi-view redundancy. Specifically, CIML considers the potential predictive capabilities of both common and unique information based on information theory. First, the common representation learning maximizes Gács-Körner common information to extract shared features and then compresses this information to learn task-relevant representations based on the Information Bottleneck (IB) principle. For unique representation learning, IB is employed to learn the most compressed unique representation for each view while simultaneously minimizing the mutual information between unique and common representations, as well as among different unique representations. Importantly, we theoretically prove that the learned joint representation is predictively sufficient for the downstream task. Extensive experimental results demonstrate the superiority of our model over several state-of-the-art methods. The code is released at CIML.
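To make the information quantities in the abstract concrete, here is a minimal NumPy sketch: a discrete mutual-information estimator (illustrating why identical views carry maximal common information and independent views carry none), and a hypothetical scalar form of a CIML-style objective. The function names, weights, and the `ciml_objective` formulation are our assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def mutual_information(joint):
    """I(X;Y) in nats from a discrete joint distribution table."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()              # normalize to a distribution
    px = joint.sum(axis=1, keepdims=True)    # marginal of X
    py = joint.sum(axis=0, keepdims=True)    # marginal of Y
    nz = joint > 0                           # skip zero-probability cells
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

# Two views sharing one bit perfectly: MI equals H = log 2 nats.
shared = np.array([[0.5, 0.0],
                   [0.0, 0.5]])

# Independent views: no common information, MI is 0.
independent = np.full((2, 2), 0.25)

# Hypothetical CIML-style objective over precomputed MI terms:
# reward task relevance of common (c) and unique (u) codes,
# compress inputs via IB, and penalize representation overlap.
def ciml_objective(i_cy, i_cx, i_uy, i_ux, i_uc, i_uu,
                   beta=0.1, gamma=1.0):
    ib_common = i_cy - beta * i_cx            # IB on the common code
    ib_unique = sum(i_uy) - beta * sum(i_ux)  # IB on each unique code
    overlap = gamma * (sum(i_uc) + sum(i_uu)) # MI-minimization terms
    return ib_common + ib_unique - overlap
```

In practice the paper would estimate these mutual-information terms from learned neural representations rather than discrete tables; the sketch only fixes the sign conventions of the combined objective.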