🤖 AI Summary
This study addresses the information loss inherent in dimensionality reduction for multivariate data visualization by proposing a model-free approach that preserves the intrinsic structure of the original data. Building on the Ball Mapper algorithm from topological data analysis (TDA), the method covers the high-dimensional point cloud with balls of equal radius and represents that cover as a graph, which can be colored by auxiliary variables or model residuals. The work integrates Ball Mapper into the Stata statistical software for the first time through the ballmapper package, offering a model-agnostic tool for high-dimensional data visualization. The abstract points to applications in the existing literature across finance, economics, geography, medicine, and chemistry.
📝 Abstract
Topological Data Analysis Ball Mapper (TDABM) offers a model-free visualization of multivariate data that avoids the information loss associated with dimensionality reduction. TDABM (Dlotko, 2019) produces a cover of a multidimensional point cloud using balls of equal size; the radius of the balls is the only parameter. A TDABM visualization retains the full structure of the data. The graphs produced by TDABM can be colored according to further variables, model residuals, or variables within the multivariate data. An expanding literature makes use of the power of TDABM across Finance, Economics, Geography, Medicine and Chemistry, amongst others. We provide an introduction to TDABM and the \texttt{ballmapper} package for Stata.
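
To make the construction concrete, the following is a minimal Python sketch of the Ball Mapper idea as it is commonly described: pick landmark points greedily so that every observation lies within the radius of some landmark, form one ball per landmark, connect balls that share observations, and color each ball by the mean of an outcome over its members. It is not the code or the syntax of the \texttt{ballmapper} Stata package, which the abstract does not show; the function name `ball_mapper`, its arguments, and the toy data are all illustrative assumptions.

```python
from itertools import combinations
from math import dist
from statistics import mean


def ball_mapper(points, eps, colour_values=None):
    """Illustrative Ball Mapper sketch (hypothetical helper, not the Stata package)."""
    # Step 1: choose landmarks greedily; a point becomes a landmark only if
    # it is farther than eps from every landmark chosen so far.
    landmarks = []
    for i, p in enumerate(points):
        if all(dist(p, points[l]) > eps for l in landmarks):
            landmarks.append(i)

    # Step 2: each ball collects the indices of all points within eps
    # of its landmark. A point may belong to several balls.
    balls = [
        {i for i, p in enumerate(points) if dist(p, points[l]) <= eps}
        for l in landmarks
    ]

    # Step 3: connect two balls whenever they share at least one point.
    edges = [
        (a, b)
        for a, b in combinations(range(len(balls)), 2)
        if balls[a] & balls[b]
    ]

    # Step 4 (optional): colour each ball by the mean outcome of its members.
    colours = None
    if colour_values is not None:
        colours = [mean(colour_values[i] for i in ball) for ball in balls]
    return landmarks, balls, edges, colours


# Toy data (made up for illustration): three points on a line plus one outlier.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0)]
outcome = [1.0, 2.0, 3.0, 10.0]
landmarks, balls, edges, colours = ball_mapper(points, eps=0.6, colour_values=outcome)
print(landmarks, balls, edges, colours)
```

In this toy example the radius `eps` is the only tuning choice, mirroring the single parameter mentioned in the abstract. The greedy landmark selection depends on the order of the observations, so a different ordering can yield a different but comparable cover.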