COLE$^+$: Towards Practical Column-based Learned Storage for Blockchain Systems

📅 2026-02-28
🤖 AI Summary
COLE, a column-based learned storage design for blockchain systems, reduces storage size and improves throughput, but it does not support chain reorganization during forks or state pruning. This work proposes COLE$^+$, which adds both. In memory, it uses a rewind-supported tree structure that relies on content-defined chunking (CDC) to keep a consistent hash digest for each block, so that blocks can be rolled back during a reorganization. On disk, it uses a two-level prunable Merkle Hash Tree (the prunable version tree) so that old state can be pruned efficiently. The design keeps the column-based layout and learned index of COLE, and the authors report theoretical and empirical analyses indicating that it is suitable for practical blockchain deployments.
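To make the term "learned index" concrete, here is a minimal sketch of the general idea: a model predicts where a key sits in a sorted array, and a bounded local search corrects the prediction. This is not COLE's or COLE$^+$'s actual model. The class name `LinearLearnedIndex`, the single linear model, and the integer keys are all assumptions chosen for illustration.

```python
import bisect


class LinearLearnedIndex:
    """Illustrative learned index: one linear model plus a bounded search.

    Hypothetical sketch, not the structure used in the paper.
    Assumes a non-empty collection of numeric keys.
    """

    def __init__(self, keys):
        self.keys = sorted(keys)
        n = len(self.keys)
        lo, hi = self.keys[0], self.keys[-1]
        # Fit a line through the first and last key: key -> position.
        self.slope = (n - 1) / (hi - lo) if hi != lo else 0.0
        self.intercept = -self.slope * lo
        # Record the worst prediction error so lookups can bound their search.
        self.err = max(abs(self._predict(k) - i) for i, k in enumerate(self.keys))

    def _predict(self, key):
        return int(round(self.slope * key + self.intercept))

    def lookup(self, key):
        """Return the position of key in the sorted array, or None if absent."""
        n = len(self.keys)
        p = self._predict(key)
        # Search only the window [p - err, p + err], clamped to the array.
        lo = min(max(0, p - self.err), n)
        hi = max(lo, min(n, p + self.err + 1))
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < n and self.keys[i] == key else None
```

The stored error bound is what makes the lookup correct: every indexed key lies within `err` positions of its prediction, so the windowed binary search cannot miss it.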

📝 Abstract
Blockchain provides a decentralized and tamper-resistant ledger for securely recording transactions across a network of untrusted nodes. While its transparency and integrity are beneficial, the substantial storage requirements for maintaining a complete transaction history present significant challenges. For example, Ethereum nodes require around 23TB of storage, with an annual growth rate of 4TB. Prior studies have employed various strategies to mitigate the storage challenges. Notably, COLE significantly reduces storage size and improves throughput by adopting a column-based design that incorporates a learned index, effectively eliminating data duplication in the storage layer. However, this approach has limitations in supporting chain reorganization during blockchain forks and state pruning to minimize storage overhead. In this paper, we propose COLE$^+$, an enhanced storage solution designed to address these limitations. COLE$^+$ incorporates a novel rewind-supported in-memory tree structure for handling chain reorganization, leveraging content-defined chunking (CDC) to maintain a consistent hash digest for each block. For on-disk storage, a new two-level Merkle Hash Tree (MHT) structure, called prunable version tree, is developed to facilitate efficient state pruning. Both theoretical and empirical analyses show the effectiveness of COLE$^+$ and its potential for practical application in real-world blockchain systems.
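The role of content-defined chunking in keeping a consistent digest can be illustrated with a small sketch. It is not the paper's in-memory tree: the boundary rule, the `MASK` value, and the helpers `entry_bytes`, `chunk`, and `digest` are hypothetical. The point it shows is that when chunk boundaries depend only on the content of the entries, the resulting digest depends only on the current state, so undoing a write restores the earlier digest.

```python
import hashlib

# Hypothetical boundary rule: close a chunk after any entry whose hash has
# its two lowest bits equal to zero (about one entry in four).
MASK = 0x3


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def entry_bytes(key: str, value: str) -> bytes:
    # Length-prefix both fields so different (key, value) pairs never
    # serialize to the same bytes.
    k, v = key.encode(), value.encode()
    return len(k).to_bytes(4, "big") + k + len(v).to_bytes(4, "big") + v


def chunk(entries):
    """Split sorted (key, value) pairs at content-defined boundaries."""
    chunks, current = [], []
    for key, value in entries:
        current.append((key, value))
        if h(entry_bytes(key, value))[-1] & MASK == 0:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


def digest(state: dict) -> str:
    """Hash each chunk, then hash the concatenated chunk hashes."""
    entries = sorted(state.items())
    chunk_hashes = [
        h(b"".join(entry_bytes(k, v) for k, v in c)) for c in chunk(entries)
    ]
    return h(b"".join(chunk_hashes)).hex()
```

In this sketch, two nodes that apply the same writes in different orders compute the same digest, and rewinding a write (deleting the key it added) returns the digest to its previous value. A fixed-size split would not isolate changes as well, because one insertion shifts every later boundary.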

Problem

Research questions and friction points this paper is trying to address.

blockchain storage
chain reorganization
state pruning
learned storage
column-based storage

Innovation

Methods, ideas, or system contributions that make the work stand out.

column-based storage
learned index
chain reorganization
state pruning
Merkle Hash Tree