Kishu: Time-Traveling for Computational Notebooks

📅 2024-06-19
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing computational notebooks (e.g., Jupyter) execute cells in place, irreversibly modifying session state, and offer no efficient, reliable way to backtrack. Conventional approaches such as OS-level memory snapshots or application-level session dumps suffer from high storage overhead, checkpointing failures, and slow restoration that requires fully loading checkpoint files. This paper introduces Kishu, a notebook system that provides time-traveling through incremental checkpoints at Co-variable granularity, which preserve inter-variable dependencies. Kishu is compatible with 146 object classes from data science libraries such as Ray, Spark, and PyTorch. To return to an earlier state, it identifies the difference between the current and target states and performs incremental checkout at sub-second latency with minimal data loading. Experiments show checkpoint size reduced by up to 4.55× and checkout time reduced by up to 9.02× on a variety of notebooks.

📝 Abstract
Computational notebooks (e.g., Jupyter, Google Colab) are widely used by data scientists. A key feature of notebooks is the interactive computing model of iteratively executing cells (i.e., a set of statements) and observing the result (e.g., model or plot). Unfortunately, existing notebook systems do not offer time-traveling to past states: when the user executes a cell, the notebook session state consisting of user-defined variables can be irreversibly modified - e.g., the user cannot 'un-drop' a dataframe column. This is because, unlike DBMS, existing notebook systems do not keep track of the session state. Existing techniques for checkpointing and restoring session states, such as OS-level memory snapshot or application-level session dump, are insufficient: checkpointing can incur prohibitive storage costs and may fail, while restoration can only be inefficiently performed from scratch by fully loading checkpoint files. In this paper, we introduce a new notebook system, Kishu, that offers time-traveling to and from arbitrary notebook states using an efficient and fault-tolerant incremental checkpoint and checkout mechanism. Kishu creates incremental checkpoints that are small and correctly preserve complex inter-variable dependencies at a novel Co-variable granularity. Then, to return to a previous state, Kishu accurately identifies the state difference between the current and target states to perform incremental checkout at sub-second latency with minimal data loading. Kishu is compatible with 146 object classes from popular data science libraries (e.g., Ray, Spark, PyTorch), and reduces checkpoint size and checkout time by up to 4.55x and 9.02x, respectively, on a variety of notebooks.

Problem

Research questions and friction points this paper is trying to address.

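Notebooks offer no time-traveling to past states: executing a cell can irreversibly modify the session state.
Existing checkpointing techniques (OS-level memory snapshots, application-level session dumps) can incur prohibitive storage costs and may fail.
Restoration can only be performed from scratch by fully loading checkpoint files, which is inefficient.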

Innovation

Methods, ideas, or system contributions that make the work stand out.

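Incremental checkpointing at Co-variable granularity, producing small checkpoints that preserve inter-variable dependencies
Incremental checkout at sub-second latency, loading only the difference between the current and target states
Compatibility with 146 object classes from popular data science libraries (e.g., Ray, Spark, PyTorch)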
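
The incremental checkpoint and checkout ideas listed above can be illustrated with a toy sketch. This is not Kishu's implementation or API. `ToyStore`, its methods, and the hand-written `groups` list are hypothetical names introduced for illustration, and a plain dict stands in for the notebook namespace. The sketch assumes that variables sharing a reference are stored together as one unit, that a unit is written only when its serialized content changes, and that checkout reloads only the units that differ between the current and target checkpoints.

```python
import hashlib
import pickle


class ToyStore:
    """Hypothetical toy checkpoint store; not Kishu's actual design."""

    def __init__(self):
        self.blobs = {}        # content digest -> pickled bytes, written once
        self.checkpoints = []  # one manifest per checkpoint: group -> digest

    def checkpoint(self, namespace, groups):
        """Store each variable group, writing only groups whose content changed."""
        manifest = {}
        written = 0
        for group in groups:
            # Pickling the group as one unit keeps shared references intact.
            payload = pickle.dumps({name: namespace[name] for name in group})
            digest = hashlib.sha256(payload).hexdigest()
            if digest not in self.blobs:
                self.blobs[digest] = payload
                written += 1
            manifest[group] = digest
        self.checkpoints.append(manifest)
        return written

    def checkout(self, namespace, current_id, target_id):
        """Move the namespace to the target state, loading only differing groups."""
        current = self.checkpoints[current_id]
        target = self.checkpoints[target_id]
        # Drop variables that exist now but did not exist in the target state.
        for group in current.keys() - target.keys():
            for name in group:
                namespace.pop(name, None)
        loaded = 0
        for group, digest in target.items():
            if current.get(group) != digest:
                namespace.update(pickle.loads(self.blobs[digest]))
                loaded += 1
        return loaded


# A dict stands in for the notebook session state (assumption).
ns = {"rows": [1, 2, 3], "config": {"lr": 0.1}}
ns["view"] = {"data": ns["rows"]}          # "view" shares a reference with "rows"
groups = [("rows", "view"), ("config",)]   # grouping supplied by hand here

store = ToyStore()
w0 = store.checkpoint(ns, groups)  # both groups are new, so 2 blobs are written
ns["rows"].append(4)               # a later cell mutates "rows" in place
w1 = store.checkpoint(ns, groups)  # only ("rows", "view") changed: 1 blob written
loaded = store.checkout(ns, current_id=1, target_id=0)  # 1 group reloaded
```

After the checkout, `ns["rows"]` is `[1, 2, 3]` again and `ns["view"]["data"]` still refers to the same list object as `ns["rows"]`, because the two variables were serialized together. The unchanged `config` group is neither rewritten at the second checkpoint nor reloaded at checkout. How Kishu actually detects Co-variables and state differences is not described in this summary.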