AI Summary
Existing studies rely on small-scale, manually constructed experiments to evaluate how checkpointing affects data exploration efficiency in computational notebooks, which limits their generalizability to complex, real-world scenarios. To address this, we propose an AI-agent-based, large-scale automated evaluation framework, the first to enable reproducible simulation across over 1,000 exploration paths and nearly 3,000 code blocks while accurately modeling backtracking and branching exploration behaviors. The framework integrates a notebook execution engine, state snapshotting, and path-tracking mechanisms to support fine-grained analysis of execution efficiency. Experimental results show that checkpointing significantly reduces redundant re-execution and repeated variable computation, yielding an average 37% improvement in execution efficiency across more than 1,000 exploration paths. This work overcomes the scalability limitations of prior evaluation methodologies and establishes a new paradigm for optimizing interactive data analysis systems.
Abstract
Saving, or checkpointing, intermediate results during interactive data exploration can potentially boost user productivity. However, existing studies on this topic are limited, as they primarily rely on small-scale experiments with human participants, a fundamental constraint of human subject studies. To address this limitation, we employ AI agents to simulate a large number of complex data exploration scenarios, including revisiting past states and branching into new exploration paths. This strategy enables us to accurately assess the impact of checkpointing while closely mimicking the behavior of real-world data practitioners. Our evaluation, spanning more than 1,000 exploration paths and 2,848 executed code blocks, shows that a checkpointing framework for computational notebooks can indeed enhance productivity by reducing unnecessary code re-execution and redundant variable computation.
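The core idea evaluated above — restoring a saved notebook state so a new exploration branch can resume without re-executing earlier cells — can be illustrated with a minimal sketch. This is not the paper's implementation; the `CheckpointStore` class and its namespace-snapshot approach are hypothetical, shown here only to make the checkpointing concept concrete.

```python
import pickle


class CheckpointStore:
    """Hypothetical store that snapshots a notebook's variable namespace."""

    def __init__(self):
        self._snapshots = {}

    def save(self, name, namespace):
        # Serialize the current variables so this state can be
        # restored later without re-running the cells that built it.
        self._snapshots[name] = pickle.dumps(namespace)

    def restore(self, name):
        # Deserialize an independent copy of the saved namespace,
        # allowing a new exploration branch to start from it.
        return pickle.loads(self._snapshots[name])


# Simulated exploration: an "expensive" step, a checkpoint, two branches.
store = CheckpointStore()
state = {"rows": list(range(5))}       # stand-in for a costly data-loading cell
store.save("after_load", state)

branch_a = store.restore("after_load")
branch_a["rows"] = [x * 2 for x in branch_a["rows"]]   # transform in branch A

branch_b = store.restore("after_load")  # branch B resumes without re-loading
```

Because each `restore` yields an independent copy, branch B still sees the original data even after branch A mutates its own copy, mimicking the backtracking-and-branching behavior the evaluation simulates.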