VisAnatomy: An SVG Chart Corpus with Fine-Grained Semantic Labels

๐Ÿ“… 2024-10-16
๐Ÿ›๏ธ arXiv.org
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
Existing chart corpora typically annotate only high-level categories (e.g., chart type), limiting AI-driven fine-grained visualization understanding. To address this, we introduce the first fine-grained semantic annotation corpus for SVG charts, comprising 942 real-world charts, 40 chart types, and over 383,000 graphical elements. Annotations span element type/role/position, grouping structure, layout, and visual encoding. We propose the first full-stack, anatomy-inspired semantic annotation paradigm for SVGโ€”integrating SVG parsing, multi-level semantic modeling, geometry- and DOM-based role inference, and accessibility navigation mapping. Experiments demonstrate significant improvements: a 12.6% gain in SVG shape recognition accuracy, an F1 score of 0.89 for chart semantic decomposition, 92.4% accuracy in chart classification, and empirical validation of an accessible navigation prototype. This work establishes a foundational resource and methodology for deep, interpretable, and inclusive SVG chart understanding.

Technology Category

Application Category

๐Ÿ“ Abstract
Chart corpora, which comprise data visualizations and their semantic labels, are crucial for advancing visualization research. However, the labels in most existing corpora are high-level (e.g., chart types), hindering their utility for broader applications in the era of AI. In this paper, we contribute VisAnatomy, a chart corpus containing 942 real-world SVG charts produced by over 50 tools, encompassing 40 chart types and featuring structural and stylistic design variations. The underlying data tables are also included if available. Each chart is augmented with multi-level fine-grained labels on its semantic components, including each graphical element's type, role, and position, hierarchical groupings of elements, group layouts, and visual encodings. In total, VisAnatomy provides labels for more than 383k graphical elements. We demonstrate the richness of the semantic labels by comparing VisAnatomy with existing corpora. We illustrate its usefulness through four applications: shape recognition for SVG elements, chart semantic decomposition, chart type classification, and content navigation for accessibility. Finally, we discuss our plan to improve VisAnatomy and research opportunities VisAnatomy presents.
Problem

Research questions and friction points this paper is trying to address.

Lack of fine-grained semantic labels in existing chart corpora
Need for diverse chart types and design variations
Enhancing AI applications with detailed visual element annotations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-grained semantic labels for SVG charts
Multi-level component annotations including element roles
Supports diverse applications like accessibility navigation
๐Ÿ”Ž Similar Papers
No similar papers found.
C
Chen Chen
Department of Computer Science, University of Maryland, College Park, USA
Hannah K. Bako
Hannah K. Bako
Department of Computer Science, University of Maryland, College Park, USA
Peihong Yu
Peihong Yu
University of Maryland, College Park
RoboticeReinforcement LearningSLAMComputer Vision
John Hooker
John Hooker
Department of Computer Science, University of Maryland, College Park, USA
J
Jeffrey Joyal
Department of Computer Science, University of Maryland, College Park, USA
S
Simon C. Wang
Department of Computer Science, University of Maryland, College Park, USA
S
Samuel Kim
Department of Computer Science, University of Maryland, College Park, USA
J
Jessica Wu
Department of Computer Science, University of Maryland, College Park, USA
A
Aoxue Ding
Department of Computer Science, University of Maryland, College Park, USA
L
Lara Sandeep
Department of Computer Science, University of Maryland, College Park, USA
Alex Chen
Alex Chen
Harvard University
Neuroscience
C
Chayanika Sinha
Department of Computer Science, University of Maryland, College Park, USA
Z
Zhicheng Liu
Department of Computer Science, University of Maryland, College Park, USA