yProv4DV: Reproducible Data Visualization Scripts Out of the Box

📅 2026-03-20
🤖 AI Summary
This work addresses the challenge of reproducibility in data visualization scripts, which are often shared without essential components such as source code, input data, execution environment, or output artifacts. To bridge this gap, the authors propose yProv4DV, a lightweight Python library that targets script-based visualization workflows and captures provenance information (source code, input data, and output results) through a single function call. Designed to be minimally invasive and ready to use, yProv4DV aims to make visualization outputs fully reproducible while keeping modifications to existing scripts to a minimum. This approach reduces the burden on researchers who want reproducible figures and fills a gap in provenance support for visualization scripts.

📝 Abstract
While visualizing results is a critical phase in communicating new academic findings, plots are frequently shared without the complete combination of code, input data, execution context, and outputs required to independently reproduce the resulting figures. Existing reproducibility solutions tend to focus on computational pipelines or workflow management systems and do not cover the script-based visualization practices commonly used by researchers and practitioners. Additionally, the minimalist nature of current Python data visualization libraries tends to speed up the creation of images, disincentivizing users from spending time integrating additional tools into these short scripts. This paper proposes yProv4DV, a lightweight library designed to enable reproducible data visualization scripts through the use of provenance information while minimizing the need for code modifications. Through a single call, users can track inputs, outputs, and source code files, so that their data visualization software can be saved and fully reproduced. As a result, this library fills a gap in reproducible research workflows by addressing the reproducibility of plots in scientific publications.
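
To make the "single call" idea concrete, here is a minimal sketch of what capturing provenance for a plotting script could look like. It is not the yProv4DV API: the function name `record_provenance`, its parameters, and the JSON layout are all hypothetical, chosen only to illustrate recording the script, its inputs, its outputs, and some execution context in one place using the Python standard library.

```python
import hashlib
import json
import platform
import sys
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def record_provenance(script, inputs, outputs, record_path="provenance.json"):
    """Hypothetical helper (NOT the yProv4DV API): write a JSON provenance record.

    The record stores a path and content hash for the script, each input,
    and each output, plus basic information about the execution context.
    """
    def describe(paths):
        return [{"path": str(p), "sha256": sha256_of(p)} for p in paths]

    record = {
        "script": describe([script])[0],
        "inputs": describe(inputs),
        "outputs": describe(outputs),
        "environment": {
            "python": platform.python_version(),
            "platform": platform.platform(),
            "argv": sys.argv,
        },
    }
    # Persist the record next to the figure so it can be shared with it.
    Path(record_path).write_text(json.dumps(record, indent=2))
    return record
```

Under these assumptions, a plotting script would end with one extra line such as `record_provenance(__file__, ["data.csv"], ["figure.png"])`. Content hashes let a reader check that a rerun used the same inputs and produced the same output file; a real tool would likely also archive the files themselves and the installed package versions.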
Problem

Research questions and friction points this paper is trying to address.

reproducibility
data visualization
scientific plots
provenance
script-based visualization
Innovation

Methods, ideas, or system contributions that make the work stand out.

reproducible visualization
provenance tracking
lightweight library
scientific reproducibility
data visualization
Gabriele Padovani
University of Trento, Italy
Sandro Fiore
University of Trento