XChainDataGen: A Cross-Chain Dataset Generation Framework

📅 2025-03-17
🤖 AI Summary
Problem: The lack of publicly available, high-quality cross-chain transaction datasets, and of tools to build them, hinders empirical research on cross-chain interoperability protocols. Method: We introduce the first open-source cross-chain transaction dataset (over 35 GB), covering 11 blockchains and 5 cross-chain protocols, with 11,285,753 real cross-chain transactions (CCTXs) that moved over 28 billion USD in token transfers during the last seven months of 2024. We propose XChainDataGen, an automated collection and structured generation framework integrating node APIs, multi-protocol parsers, transaction provenance graphs, and temporal metadata annotation. Contribution/Results: Using the dataset, we highlight how security differs between protocols that require full finality on the source blockchain and those that accept soft finality, and we present the first analysis of the implications of EIP-7683 for cross-chain intents, which speed up CCTX processing. The dataset enables comparable, evidence-based analysis of cross-chain protocols across three dimensions (security, transaction cost, and execution performance), supporting empirical research on DeFi and cross-chain infrastructure.

📝 Abstract
The number of blockchain interoperability protocols for transferring data and assets between blockchains has grown significantly. However, no open dataset of cross-chain transactions exists to study interoperability protocols in operation. There is also no tool to generate such datasets and make them available to the community. This paper proposes XChainDataGen, a tool to extract cross-chain data from blockchains and generate datasets of cross-chain transactions (cctxs). Using XChainDataGen, we extracted over 35 GB of data from five cross-chain protocols deployed on 11 blockchains in the last seven months of 2024, identifying 11,285,753 cctxs that moved over 28 billion USD in cross-chain token transfers. Using the data collected, we compare protocols and provide insights into their security, cost, and performance trade-offs. As examples, we highlight differences between protocols that require full finality on the source blockchain and those that only demand soft finality (security). We compare user costs, fee models, and the impact of variables such as the Ethereum gas price on protocol fees (cost). Finally, we produce the first analysis of the implications of EIP-7683 for cross-chain intents, which are increasingly popular and greatly improve the speed with which cctxs are processed (performance), thereby enhancing the user experience. The availability of XChainDataGen and this dataset allows various analyses, including trends in cross-chain activity, security assessments of interoperability protocols, and financial research on decentralized finance (DeFi) protocols.
Problem

Research questions and friction points this paper is trying to address.

Lack of open cross-chain transaction datasets
No tool to generate cross-chain transaction datasets
Need for analyzing security, cost, and performance of cross-chain protocols
Innovation

Methods, ideas, or system contributions that make the work stand out.

Develops XChainDataGen for cross-chain data extraction
Generates datasets of cross-chain transactions (cctxs)
Analyzes security, cost, and performance of protocols
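
To make the idea of "generating a dataset of cctxs" concrete, the sketch below pairs decoded source-chain events with destination-chain events and derives a per-transfer latency. It is only an illustration under stated assumptions, not XChainDataGen's actual code or schema: the `BridgeEvent` and `CrossChainTx` records, the shared `transfer_id` key, and the `match_cctxs` function are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class BridgeEvent:
    """One decoded bridge event (hypothetical schema, not XChainDataGen's)."""
    transfer_id: str   # assumed identifier shared by both sides of a transfer
    chain: str
    tx_hash: str
    timestamp: int     # block timestamp, seconds since the Unix epoch
    amount_usd: float


@dataclass(frozen=True)
class CrossChainTx:
    """A matched source/destination pair (hypothetical record layout)."""
    transfer_id: str
    src_chain: str
    dst_chain: str
    src_tx_hash: str
    dst_tx_hash: str
    latency_s: int     # destination timestamp minus source timestamp
    amount_usd: float


def match_cctxs(
    src_events: Iterable[BridgeEvent],
    dst_events: Iterable[BridgeEvent],
) -> List[CrossChainTx]:
    """Join source and destination events on the assumed shared transfer_id."""
    # Index destination events once so each source lookup is O(1).
    dst_by_id = {event.transfer_id: event for event in dst_events}
    matched = []
    for src in src_events:
        dst = dst_by_id.get(src.transfer_id)
        if dst is None:
            # No destination event seen: the transfer is pending or unmatched.
            continue
        matched.append(
            CrossChainTx(
                transfer_id=src.transfer_id,
                src_chain=src.chain,
                dst_chain=dst.chain,
                src_tx_hash=src.tx_hash,
                dst_tx_hash=dst.tx_hash,
                latency_s=dst.timestamp - src.timestamp,
                amount_usd=src.amount_usd,
            )
        )
    return matched
```

Records of this shape would be enough to compare protocols on latency (performance) and transferred value; cost and finality analyses would need further fields, such as fees paid and block confirmations, which are omitted here.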