🤖 AI Summary
Cross-chain bridge transactions lack explicit pairing records, which impedes fund tracing, vulnerability detection, and graph-based analysis in multi-chain environments. To address this, the authors propose ConneX, a general-purpose framework for identifying cross-chain transaction pairs that combines large language models (LLMs) with a lightweight verification module: the LLM prunes the semantic search space by proposing plausible key fields in complex transaction records, while the verification module checks those candidates against transaction values to resolve semantic ambiguity and select the correct match. On about 500,000 real-world transactions from five major bridge platforms, the approach achieves an average F1 score of 0.9746, at least 20.05% above the baselines, and shrinks the semantic search space from roughly 1e10 to fewer than 100 candidates. It was also used to trace illicit funds in real-world hacking incidents, including a cross-chain transfer worth $1 million. The work provides a scalable, automated foundation for multi-chain security analytics.
📝 Abstract
As the Web3 ecosystem evolves toward a multi-chain architecture, cross-chain bridges have become critical infrastructure for interoperability between diverse blockchain networks. However, although bridges connect otherwise isolated blockchains, they keep no record of which transaction on one chain corresponds to which transaction on the other. This gap poses significant challenges for security analysis, such as cross-chain fund tracing, advanced vulnerability detection, and transaction graph-based analysis. To address it, we introduce ConneX, an automated and general-purpose system designed to accurately identify corresponding transaction pairs across both ends of cross-chain bridges. Our system leverages Large Language Models (LLMs) to prune the semantic search space efficiently by identifying semantically plausible key information candidates within complex transaction records. It then applies a novel examiner module that refines these candidates by validating them against transaction values, resolving semantic ambiguities and identifying the correct semantics. Extensive evaluations on a dataset of about 500,000 transactions from five major bridge platforms demonstrate that ConneX achieves an average F1 score of 0.9746, surpassing baselines by at least 20.05%, while reducing the semantic search space by several orders of magnitude (from 1e10 to fewer than 100). Moreover, its successful application in tracing illicit funds in real-world hacking incidents, including a cross-chain transfer worth $1 million, underscores its practical utility for enhancing cross-chain security and transparency.
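The two-stage idea described in the abstract (propose candidate key fields, then validate them against actual transaction values) can be illustrated with a minimal Python sketch. This is not the ConneX implementation: the field names (`transferId`, `depositId`, `amount`, `value`), the record layout, and the scoring rule (count of unambiguous one-to-one value matches) are all assumptions made for illustration, and the candidate lists are hard-coded here to stand in for what an LLM might propose.

```python
from collections import defaultdict
from itertools import product


def match_on_fields(src_txs, dst_txs, src_field, dst_field):
    """Pair source and destination transactions whose values agree on the
    given fields, keeping only unambiguous (exactly one hit) matches."""
    index = defaultdict(list)
    for tx in dst_txs:
        if dst_field in tx:
            index[tx[dst_field]].append(tx["hash"])

    matches = {}
    for tx in src_txs:
        if src_field not in tx:
            continue
        hits = index.get(tx[src_field], [])
        if len(hits) == 1:  # an ambiguous value cannot identify a pair
            matches[tx["hash"]] = hits[0]
    return matches


def examine(src_txs, dst_txs, src_candidates, dst_candidates):
    """Hypothetical examiner: try every candidate field pair and keep the
    one that yields the most unambiguous transaction pairs."""
    best_fields, best_matches = None, {}
    for sf, df in product(src_candidates, dst_candidates):
        matches = match_on_fields(src_txs, dst_txs, sf, df)
        if len(matches) > len(best_matches):
            best_fields, best_matches = (sf, df), matches
    return best_fields, best_matches


# Toy records (invented for illustration, not real bridge data).
src_txs = [
    {"hash": "0xa1", "transferId": "77", "amount": "1000"},
    {"hash": "0xa2", "transferId": "78", "amount": "1000"},
]
dst_txs = [
    {"hash": "0xb1", "depositId": "77", "value": "1000"},
    {"hash": "0xb2", "depositId": "78", "value": "1000"},
]

# Candidate key fields, assumed to come from an LLM pruning step.
fields, pairs = examine(
    src_txs, dst_txs,
    src_candidates=["transferId", "amount"],
    dst_candidates=["depositId", "value"],
)
print(fields, pairs)
```

In this toy data the amount fields look semantically plausible as keys but match ambiguously (both transfers carry the same amount), so checking against values selects the identifier fields instead. With only a handful of candidates per side, the exhaustive check over field pairs stays cheap, which is the practical benefit of pruning the search space first.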