🤖 AI Summary
To address the information loss and dimensionality limits of conventional plots of high-dimensional multivariate data, this paper presents Topological Data Analysis Ball Mapper (TDABM), a model-free topological visualization method. TDABM summarizes a dataset as a graph that represents the joint structure of arbitrarily many variables at once, which the authors describe as visualization without information loss. The paper is a guide that introduces TDABM with accompanying Python code; each input variable must be ordinal, and the resulting graph can be colored by further variables. Auxiliary variables, including model predictions, residuals, or ground-truth labels, can be mapped onto the graph to expose distributional patterns and outcome associations. Unlike a scatter plot, which is limited to two or three axes, the approach handles any number of axes simultaneously, supporting high-dimensional data exploration, statistical model diagnostics, and interpretability assessment.
📝 Abstract
Visualization of data is an important step in understanding data and evaluating statistical models. Topological Data Analysis Ball Mapper (TDABM), after Dlotko (2019), provides a model-free means to visualize multivariate datasets without information loss. To permit the construction of a TDABM graph, each variable must be ordinal and have sufficiently many values to make a scatterplot of interest. Where a scatterplot works with two or three axes, the TDABM graph can handle any number of axes simultaneously. The result is a visualization of the structure of the data. The TDABM graph also permits coloration by additional variables, enabling the mapping of outcomes across the joint distribution of the axes. These strengths of TDABM for understanding data and evaluating models lie behind a rapidly expanding literature. This guide provides an introduction to TDABM with code in Python.
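To make the idea of a ball-based graph over many axes concrete, the following is a minimal sketch of the Ball Mapper construction as it is commonly described: pick landmark points greedily so that every observation lies within a radius of some landmark, form a ball around each landmark, connect two balls when they share at least one observation, and color each ball by the mean of an additional variable. This is not the code that accompanies the guide, and it does not use any TDABM package API. The function names `ball_mapper` and `color_balls`, the radius parameter `eps`, the Euclidean metric, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np


def ball_mapper(X, eps):
    """Illustrative Ball Mapper sketch (hypothetical helper, not a library API).

    X   : array of shape (n_points, n_axes); any number of axes is allowed.
    eps : ball radius (assumed Euclidean distance).
    Returns landmark row indices, the members of each ball, and the edges.
    """
    X = np.asarray(X, dtype=float)
    covered = np.zeros(X.shape[0], dtype=bool)
    landmarks = []
    # Greedy cover: any point not yet inside a ball becomes a new landmark.
    for i in range(X.shape[0]):
        if not covered[i]:
            landmarks.append(i)
            covered |= np.linalg.norm(X - X[i], axis=1) <= eps
    # Each ball holds every point within eps of its landmark.
    balls = [
        np.flatnonzero(np.linalg.norm(X - X[l], axis=1) <= eps)
        for l in landmarks
    ]
    # Two balls are joined by an edge when they share at least one point.
    edges = set()
    for a in range(len(balls)):
        for b in range(a + 1, len(balls)):
            if np.intersect1d(balls[a], balls[b]).size > 0:
                edges.add((a, b))
    return landmarks, balls, edges


def color_balls(balls, y):
    """Mean of an additional variable y over the points in each ball."""
    y = np.asarray(y, dtype=float)
    return [float(y[members].mean()) for members in balls]


# Toy data (assumed for illustration): four points on two axes.
X = [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0], [5.0, 0.0]]
y = [0.0, 1.0, 2.0, 10.0]
landmarks, balls, edges = ball_mapper(X, eps=0.6)
colors = color_balls(balls, y)
```

In this toy case the first two balls overlap at the second point and are therefore connected, while the distant fourth point forms an isolated ball. The radius `eps` controls the resolution of the graph: a larger radius gives fewer, larger balls and more edges. In practice the axes are often rescaled before distances are computed so that no single variable dominates, though whether and how to do so is a modeling choice.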