🤖 AI Summary
Existing Voronoi treemaps represent hierarchical structure effectively but do not preserve data-driven similarities between nodes, such as geographic proximity or the similarity of semantic embeddings. This paper proposes a similarity-aware Voronoi treemap method that integrates similarity into the layout process. Specifically, similarity is taken into account during data preprocessing; a Kuhn-Munkres optimal assignment matches nodes to the cells of a centroidal Voronoi tessellation (CVT), producing an initial diagram with equal cell sizes at each level; greedy swapping then refines the cell neighborhoods; and cell areas are iteratively adjusted to their target sizes while the existing neighborhoods are preserved. The method maintains hierarchical nesting while improving how well neighborhoods reflect data similarity. Experiments on multiple real-world datasets show that the approach achieves an average 23.6% improvement in neighborhood preservation over baseline methods and a 19.4% gain in quantitative visualization quality scores. To the best of the authors' knowledge, this is the first work to jointly model hierarchical structure and data similarity within the Voronoi treemap framework.
📝 Abstract
Voronoi treemaps are used to depict nodes and their hierarchical relationships simultaneously. However, in addition to the hierarchical structure, the data frequently carry further attributes, such as co-occurring features or similarities. Examples include geographical attributes like shared borders between countries or contextualized semantic information such as embedding vectors derived from large language models. In this work, we introduce a Voronoi treemap algorithm that leverages data similarity to generate neighborhood-preserving treemaps. First, we extend the treemap layout pipeline to consider similarity during data preprocessing. We then use a Kuhn-Munkres matching of similarities to centroidal Voronoi tessellation (CVT) cells to create initial Voronoi diagrams with equal cell sizes for each level. Greedy swapping is then used to improve the neighborhoods of cells so that they match the data's similarity more closely. During optimization, cell areas are iteratively adjusted to their target sizes while preserving the existing neighborhoods. We demonstrate the practicality of our approach through multiple real-world examples drawn from infographics and linguistics. To quantitatively assess the resulting treemaps, we employ treemap metrics and measure neighborhood preservation.
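The greedy swapping step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the cell adjacency of the equal-area initial diagram is fixed, that an initial placement of nodes in cells is already available (for example from the Kuhn-Munkres matching), and that neighborhood quality is scored as the sum of similarities over adjacent cells. The names `neighborhood_score` and `greedy_swap`, the best-improvement strategy, and the toy data are all hypothetical choices made for illustration.

```python
from itertools import combinations


def neighborhood_score(placement, adjacency, sim):
    """Sum of similarities over all pairs of adjacent cells.

    placement[c] is the node currently occupying cell c (assumed encoding).
    """
    return sum(
        sim[placement[c]][placement[d]]
        for c in adjacency
        for d in adjacency[c]
        if c < d  # count each undirected adjacency once
    )


def greedy_swap(placement, adjacency, sim):
    """Repeatedly apply the single swap of two cells' nodes that raises the
    score the most; stop when no swap improves it (assumed stopping rule)."""
    placement = list(placement)
    current = neighborhood_score(placement, adjacency, sim)
    while True:
        best_gain, best_pair = 0, None
        for c, d in combinations(range(len(placement)), 2):
            # Try the swap, measure the change in score, then undo it.
            placement[c], placement[d] = placement[d], placement[c]
            gain = neighborhood_score(placement, adjacency, sim) - current
            placement[c], placement[d] = placement[d], placement[c]
            if gain > best_gain:
                best_gain, best_pair = gain, (c, d)
        if best_pair is None:
            return placement, current
        c, d = best_pair
        placement[c], placement[d] = placement[d], placement[c]
        current += best_gain


# Toy example: four cells in a row (0-1-2-3); nodes 0 and 1 are similar,
# and nodes 2 and 3 are similar.
adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
sim = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
start = [0, 2, 1, 3]  # similar nodes are separated in the initial placement
layout, score = greedy_swap(start, adjacency, sim)
print(layout, score)
```

In this toy case the starting placement has no similar nodes in adjacent cells, and the swap loop ends with both similar pairs adjacent. Because every accepted swap strictly increases the score and the number of placements is finite, the loop terminates, although only at a local optimum in general.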