🤖 AI Summary
High-dimensional data visualization often suffers from instability: the output is sensitive to the choice of dimensionality reduction (DR) method and its hyperparameters, which undermines robustness and reproducibility. To address this, we propose a consensus DR framework based on multi-view learning, which fuses multiple DR outputs to automatically extract a shared low-dimensional structure invariant to both method selection and hyperparameter configuration. Our approach integrates subspace alignment, consensus matrix construction, and joint embedding optimization to achieve visualizations that are stable across methods and parameter settings. Extensive experiments on synthetic and real-world datasets demonstrate that our method substantially improves visualization robustness, preserving structural consistency under DR method switching and hyperparameter perturbations. As a result, it delivers more reliable and reproducible low-dimensional representations for exploratory high-dimensional data analysis.
📝 Abstract
A plethora of dimension reduction methods have been developed to visualize high-dimensional data in low dimensions. However, different dimension reduction methods often produce different, and possibly conflicting, visualizations of the same data. This problem is further exacerbated by the choice of hyperparameters, which can substantially impact the resulting visualization. To obtain a more robust and trustworthy dimension reduction output, we advocate for a consensus approach, which summarizes multiple visualizations into a single consensus dimension reduction visualization. Here, we leverage ideas from multi-view learning to identify the patterns that are most stable or shared across the many different dimension reduction visualizations, or views, and subsequently visualize this shared structure in a single low-dimensional plot. We demonstrate through both simulated and real-world case studies that this consensus visualization effectively identifies and preserves the shared low-dimensional data structure. We further highlight our method's robustness to the choice of dimension reduction method and hyperparameters -- a highly desirable property when working towards trustworthy and reproducible data science.
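To make the pipeline concrete, here is a minimal sketch of one way a consensus visualization could be assembled from multiple DR views, using off-the-shelf scikit-learn components. The specific choices are illustrative assumptions, not the paper's actual algorithm: PCA and RBF kernel PCA stand in for the set of DR methods/hyperparameters, the consensus matrix is taken as the average of each view's normalized pairwise-distance matrix, and the joint embedding is obtained by metric MDS on that matrix.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA, KernelPCA
from sklearn.manifold import MDS

# Toy high-dimensional data with cluster structure
X, _ = make_blobs(n_samples=80, n_features=10, centers=3, random_state=0)

# Step 1: several DR "views" of the same data
# (stand-ins for different methods / hyperparameter choices)
views = [
    PCA(n_components=2).fit_transform(X),
    KernelPCA(n_components=2, kernel="rbf", gamma=0.1).fit_transform(X),
]

# Step 2: consensus matrix -- average the pairwise-distance matrices
# of all views, normalized so no single view dominates the average
dists = [squareform(pdist(V)) for V in views]
dists = [D / D.max() for D in dists]
consensus_D = np.mean(dists, axis=0)

# Step 3: joint embedding -- metric MDS on the consensus dissimilarities
consensus = MDS(
    n_components=2, dissimilarity="precomputed", random_state=0
).fit_transform(consensus_D)

print(consensus.shape)  # one 2-D consensus embedding for all 80 points
```

Averaging distance matrices rather than raw coordinates sidesteps the fact that different DR outputs live in arbitrarily rotated, reflected, and scaled coordinate systems; an alternative sketch would Procrustes-align the views and average the aligned coordinates directly.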