🤖 AI Summary
This study addresses an accessibility gap in Jupyter notebooks: Matplotlib visualizations usually lack high-quality alternative text (alt text). We propose a hybrid generation method that combines heuristic rules with a large language model (GPT-4-turbo). By parsing Matplotlib's figure objects, we extract structured tabular data and visual features (chart type, axes, labels, and subplot layout) and inject them into LLM prompts, which improves factual accuracy and coverage across diverse chart types. The approach supports simple single-variable plots as well as complex multi-subplot and multi-axis visualizations. We deliver a lightweight Python toolkit that generates, validates, and embeds alt text in notebooks with a single line of code or command. Experimental evaluation on real-world notebooks shows a 37.2% absolute improvement in alt text accuracy for complex charts. The toolkit balances automation with user configurability. To our knowledge, this is the first framework for accessible computational notebooks that pairs rule-based generation with an LLM prompted on data extracted from the figure.
📝 Abstract
We present MatplotAlt, an open-source Python package for easily adding alternative text to Matplotlib figures. MatplotAlt equips Jupyter notebook authors to automatically generate and surface chart descriptions with a single line of code or command, and supports a range of options that allow users to customize the generation and display of captions based on their preferences and accessibility needs. Our evaluation indicates that MatplotAlt's heuristic and LLM-based methods for generating alt text can create accurate long-form descriptions of both simple univariate and complex Matplotlib figures. We find that state-of-the-art LLMs still struggle with factual errors when describing charts, and we improve the accuracy of our descriptions by prompting GPT-4-turbo with heuristic-based alt text or data tables parsed from the Matplotlib figure.
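
To make the heuristic idea concrete, the following is a minimal sketch of how a description might be derived from a Matplotlib figure and then folded into an LLM prompt. It is not MatplotAlt's API: the function names `describe_figure` and `build_prompt`, the wording of the description, and the prompt text are all assumptions made for illustration. Only standard Matplotlib accessors (`fig.axes`, `ax.get_title()`, `ax.get_lines()`, `line.get_xdata()`) are used, and only line charts are handled.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def describe_figure(fig):
    """Hypothetical helper: build a heuristic description from a figure's line plots."""
    parts = []
    for i, ax in enumerate(fig.axes, start=1):
        title = ax.get_title() or "untitled"
        xlabel = ax.get_xlabel() or "x"
        ylabel = ax.get_ylabel() or "y"
        text = (f"Subplot {i}: a line chart titled '{title}' with "
                f"{xlabel} on the x-axis and {ylabel} on the y-axis.")
        for line in ax.get_lines():
            xs = [float(v) for v in line.get_xdata()]
            ys = [float(v) for v in line.get_ydata()]
            if not ys:
                continue
            label = line.get_label()
            # Matplotlib assigns auto-generated labels that start with "_".
            name = "unnamed" if label.startswith("_") else label
            peak_x = xs[ys.index(max(ys))]
            text += (f" Series '{name}' ranges from {min(ys):g} to {max(ys):g}, "
                     f"peaking at {xlabel} = {peak_x:g}.")
        parts.append(text)
    return " ".join(parts)


def build_prompt(heuristic_text):
    """Hypothetical helper: wrap the heuristic description in an LLM prompt."""
    return ("Write alt text for the attached chart. Use these facts extracted "
            "from the figure and do not contradict them:\n" + heuristic_text)


# Small demo figure (made-up data, for illustration only).
fig, ax = plt.subplots()
ax.plot([1, 2, 3], [2, 5, 3], label="sales")
ax.set_title("Sales")
ax.set_xlabel("month")
ax.set_ylabel("units")

alt = describe_figure(fig)
prompt = build_prompt(alt)
```

Grounding the prompt in values read directly from the figure objects, rather than asking the model to read them off the rendered image, is the kind of augmentation the abstract credits with reducing factual errors.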