🤖 AI Summary
This work addresses two challenges that commonly arise together in multi-view multi-label learning and significantly hinder model performance: incomplete features and partial label annotation. To tackle these issues, the authors propose an adaptive disentangled representation learning framework that recovers missing views through neighborhood-aware cross-modal feature propagation and random-masking-based reconstruction. The method explicitly models dependencies among label distributions to refine predictions, and imposes mutual-information constraints to strengthen the consistency of shared representations while suppressing cross-view redundancy. It further couples a prototype-guided pseudo-label space with view-specific feature selection to drive discriminative view fusion. Extensive experiments on public benchmarks and real-world scenarios show that the proposed approach consistently outperforms state-of-the-art methods on incomplete multi-view multi-label data.
📝 Abstract
Multi-view multi-label learning frequently suffers from simultaneous feature absence and incomplete annotation, owing to difficulties in data acquisition and the cost of supervision. To tackle this complex yet highly practical problem, and to overcome the limitations of existing approaches in feature recovery, representation disentanglement, and label-semantics modeling, we propose an Adaptive Disentangled Representation Learning method (ADRL). ADRL achieves robust view completion by propagating feature-level affinity across modalities in a neighborhood-aware manner, and strengthens reconstruction through a stochastic masking strategy. By spreading category-level associations across label distributions, ADRL refines the distribution parameters to capture interdependent label prototypes. In addition, we formulate a mutual-information-based objective that promotes consistency among shared representations and suppresses the information overlap between each view-specific representation and the other modalities; we derive tractable bounds that make this objective trainable in a dual-channel network. Moreover, ADRL performs prototype-specific feature selection through independent interactions between label embeddings and view representations, generating pseudo-labels for each category; the structure of the resulting pseudo-label space then guides a discriminative trade-off during view fusion. Finally, extensive experiments on public datasets and real-world applications demonstrate the superior performance of ADRL.
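To make the view-completion idea more concrete, here is a minimal NumPy sketch of masked reconstruction via neighborhood-aware cross-view propagation. It is an illustration only: the function name, the k-nearest-neighbor affinity, and the neighbor-mean fill rule are all assumptions for exposition, not the paper's actual ADRL implementation (which is learned end-to-end).

```python
import numpy as np

rng = np.random.default_rng(0)

def masked_neighbor_completion(x_target, x_source, mask_rate=0.3, k=3):
    """Hypothetical sketch: randomly mask entries of one view, then fill
    each masked entry from the sample's k nearest neighbors, where the
    neighborhood is computed in another (complete) view."""
    n, d = x_target.shape
    mask = rng.random((n, d)) < mask_rate      # True = feature dropped
    x_masked = np.where(mask, 0.0, x_target)

    # Cross-modal affinity: distances measured in the source view.
    dist = np.linalg.norm(x_source[:, None] - x_source[None, :], axis=-1)
    np.fill_diagonal(dist, np.inf)             # exclude self-matches
    nbrs = np.argsort(dist, axis=1)[:, :k]     # k nearest neighbors per sample

    # Propagate: replace masked entries with the neighbors' mean features.
    nbr_mean = x_target[nbrs].mean(axis=1)
    return np.where(mask, nbr_mean, x_masked)
```

In the full method this role is played by a trained network; the random masking here mirrors the stochastic masking strategy used to reinforce reconstruction, since the model must recover deliberately hidden entries rather than only the truly missing ones.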