AI Summary
This work addresses the dual incompleteness problem in multi-view multi-label learning, where views and labels may be missing simultaneously, a scenario in which existing methods struggle to learn stable and discriminative shared representations for lack of explicit structural constraints. To tackle this challenge, the authors propose a structured consistent representation learning framework that learns discrete consensus representations through a shared multi-view codebook and cross-view reconstruction. At the decision level, a label-correlation-aware view-weighted fusion mechanism exploits structural dependencies among labels. A teacher-guided self-distillation architecture then distills global knowledge from the fused prediction back into the individual view-specific branches, improving generalization. Extensive experiments on five benchmark datasets show that the proposed method significantly outperforms state-of-the-art approaches and remains robust under dual missingness.
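The shared-codebook alignment can be illustrated with a minimal vector-quantization sketch. All names, shapes, and the fixed random codebook below are illustrative assumptions, not the authors' implementation: each view's embedding is snapped to its nearest entry in a single codebook shared across views, so different views of the same sample are drawn toward the same discrete codes.

```python
import numpy as np

def quantize(z, codebook):
    """Map each row of z (n, d) to its nearest entry of a codebook (K, d)
    shared across all views; returns quantized vectors and code indices."""
    # pairwise squared distances between embeddings and codebook entries
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)
    return codebook[idx], idx

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))                # K = 8 shared codes, d = 4
view_a = rng.normal(size=(5, 4))                  # embeddings from one view
view_b = view_a + 0.05 * rng.normal(size=(5, 4))  # slightly perturbed second view
q_a, idx_a = quantize(view_a, codebook)
q_b, idx_b = quantize(view_b, codebook)
# Because the codebook is shared and finite, near-duplicate views tend to
# select the same codes -- the alignment effect attributed to the limited
# shared codebook embeddings.
```

In the full method this quantization would be trained jointly with cross-view reconstruction; here the codebook is a fixed random matrix purely to show the snapping mechanism.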
Abstract
Although multi-view multi-label learning has been extensively studied, the dual-missing scenario, in which both views and labels are incomplete, remains largely unexplored. Existing methods mainly rely on contrastive learning or information bottleneck theory to learn consistent representations under missing-view conditions, but loss-based alignment without explicit structural constraints limits their ability to capture stable and discriminative shared semantics. To address this issue, we introduce a more structured mechanism for consistent representation learning: we learn discrete consistent representations through a multi-view shared codebook and cross-view reconstruction, which naturally aligns different views within a limited set of shared codebook embeddings and reduces feature redundancy. At the decision level, we design a weight estimation method that evaluates how well each view preserves label correlation structures and assigns fusion weights accordingly, enhancing the quality of the fused prediction. In addition, we introduce a fused-teacher self-distillation framework in which the fused prediction guides the training of the view-specific classifiers, feeding global knowledge back into the single-view branches and improving generalization under missing-label conditions. Extensive comparative experiments against advanced methods on five benchmark datasets demonstrate the effectiveness of the proposed method. Code is available at https://github.com/xuy11/SCSD.
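The decision-level weighting and fused-teacher distillation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the Frobenius-distance similarity between label-correlation matrices and the MSE distillation term are simple stand-ins, and all names and toy data are hypothetical rather than the paper's actual formulation.

```python
import numpy as np

def view_weights(preds, labels):
    """Weight each view by how closely its predicted label co-occurrence
    structure matches that of the observed labels (illustrative criterion)."""
    target = np.corrcoef(labels, rowvar=False)          # label correlation structure
    w = []
    for p in preds:
        c = np.corrcoef(p, rowvar=False)                # view's predicted structure
        sim = 1.0 / (1.0 + np.linalg.norm(c - target))  # Frobenius similarity
        w.append(sim)
    w = np.array(w)
    return w / w.sum()                                  # normalize to a convex combination

def fuse_and_distill(preds, w):
    """Weighted fusion of view predictions; the fused result acts as a soft
    teacher for each view through a simple MSE distillation term."""
    fused = sum(wi * p for wi, p in zip(w, preds))
    distill = sum(((p - fused) ** 2).mean() for p in preds) / len(preds)
    return fused, distill

# toy data: 5 samples, 3 labels, 2 views (values purely illustrative)
labels = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1], [1, 0, 0]], float)
rng = np.random.default_rng(1)
preds = [rng.random((5, 3)) for _ in range(2)]
w = view_weights(preds, labels)
fused, distill = fuse_and_distill(preds, w)
```

Minimizing the distillation term pulls each view-specific classifier toward the view-weighted fused prediction, which is the feedback loop the abstract describes; a real implementation would use classifier logits and a divergence-based loss rather than raw scores and MSE.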