🤖 AI Summary
Multi-view learning confronts challenges of prediction inconsistency and uncertainty arising from inter-view discrepancies. To address this, we propose a hierarchical mutual distillation framework that, for the first time, enables cross-view consistency modeling across three distinct input configurations: single-view, partial multi-view, and full multi-view. We introduce an uncertainty-weighted mutual distillation mechanism that dynamically suppresses interference from unreliable predictions. Furthermore, we integrate a CNN-Transformer hybrid architecture with multi-granularity knowledge transfer to adaptively fuse view-specific features. Evaluated on a large-scale, unstructured dataset captured from diverse, non-fixed viewpoints, our method significantly improves both predictive accuracy and cross-view consistency, outperforming state-of-the-art multi-view learning approaches on multiple standard benchmarks.
📝 Abstract
Multi-view learning often faces challenges in effectively leveraging images captured from different angles and locations. This challenge is particularly pronounced when addressing inconsistencies and uncertainties between views. In this paper, we propose a novel Multi-View Uncertainty-Weighted Mutual Distillation (MV-UWMD) method. Our method enhances prediction consistency by performing hierarchical mutual distillation across all possible view combinations, including single-view, partial multi-view, and full multi-view predictions. We further introduce an uncertainty-based weighting mechanism within the mutual distillation process, allowing effective exploitation of the unique information in each view while mitigating the impact of uncertain predictions. We extend a CNN-Transformer hybrid architecture to facilitate robust feature learning and integration across multiple view combinations. We conducted extensive experiments on a large, unstructured dataset captured from diverse, non-fixed viewpoints. The results demonstrate that MV-UWMD improves prediction accuracy and consistency compared to existing multi-view learning approaches.
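To make the core idea concrete, here is a minimal sketch of uncertainty-weighted mutual distillation between two view-specific predictors. All names, the symmetric KL formulation, and the entropy-based confidence weighting are illustrative assumptions for a two-view case, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def confidence_weight(logits: torch.Tensor) -> torch.Tensor:
    # Assumed uncertainty proxy: normalized predictive entropy.
    # Confident (low-entropy) teacher predictions get weight near 1,
    # uncertain ones near 0, suppressing unreliable supervision.
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    max_entropy = torch.log(torch.tensor(float(logits.shape[-1])))
    return 1.0 - entropy / max_entropy  # shape: (batch,)

def mutual_distillation_loss(logits_a: torch.Tensor,
                             logits_b: torch.Tensor,
                             temperature: float = 2.0) -> torch.Tensor:
    # Symmetric (mutual) KL distillation: each direction is scaled
    # by the teacher's confidence, so an uncertain view contributes
    # less to the consistency objective.
    t = temperature
    log_p_a = F.log_softmax(logits_a / t, dim=-1)
    log_p_b = F.log_softmax(logits_b / t, dim=-1)
    p_a = log_p_a.exp().detach()  # view A acting as teacher
    p_b = log_p_b.exp().detach()  # view B acting as teacher
    w_a = confidence_weight(logits_a).detach()
    w_b = confidence_weight(logits_b).detach()
    kl_b_from_a = F.kl_div(log_p_b, p_a, reduction="none").sum(-1)
    kl_a_from_b = F.kl_div(log_p_a, p_b, reduction="none").sum(-1)
    return (w_a * kl_b_from_a + w_b * kl_a_from_b).mean() * (t * t)

# Example: logits from two views of the same batch, 4-class problem.
logits_view1 = torch.randn(8, 4)
logits_view2 = torch.randn(8, 4)
loss = mutual_distillation_loss(logits_view1, logits_view2)
```

In the full method this pairwise term would be applied hierarchically across single-view, partial multi-view, and full multi-view predictions rather than only between two heads.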