🤖 AI Summary
Web3 scam accounts are hard to detect because they mimic legitimate activity, and existing methods both neglect temporal dynamics and adapt poorly to transaction networks that follow a power-law distribution. To address this, the paper proposes ScamSweeper, a framework that models the dynamic evolution of transaction graphs. It samples the network with a structure temporal random walk, encodes the subgraph from each time interval with a directed graph encoder, and applies a variational Transformer to the resulting subgraph sequence to capture how the graph evolves. The contributions include: (1) a large-scale Ethereum transaction dataset covering Web3 scam, phishing, and normal accounts; and (2) improved detection results, with a weighted F1-score of 0.76 (an improvement of at least 17.29% over the 0.59 baseline) for Web3 scam detection against SIEGE, Ethident, and PDTGA, and an F1-score of 0.97 (an improvement of at least 17.5% over the 0.80 baseline) for phishing node detection against DGTSG and BERT4ETH.
📝 Abstract
Web3 applications have been growing rapidly, especially on the Ethereum platform, and have become a target for scammers. Web3 scams imitate the services of legitimate platforms and mimic regular activity to deceive users. However, previous studies have concentrated primarily on de-anonymization and phishing nodes, neglecting the distinctive features of Web3 scams. Moreover, current phishing account detection tools rely on graph learning or sampling algorithms to obtain graph features, yet large-scale transaction networks with temporal attributes conform to a power-law distribution, which makes Web3 scams difficult to detect. To overcome these challenges, we present ScamSweeper, a novel framework that emphasizes the dynamic evolution of transaction graphs to identify Web3 scams on Ethereum. ScamSweeper samples the network with a structure temporal random walk, an optimized sampling method that considers both temporal attributes and structural information. A directed graph encoder then generates features for the subgraph in each temporal interval, and these features are ordered into a sequence. A variational Transformer extracts the dynamic evolution from this subgraph sequence. Furthermore, we collect a large-scale transaction dataset of Web3 scam, phishing, and normal accounts from the first 18 million block heights on Ethereum, and we comprehensively analyze how these account types differ in attributes including nodes, edges, and degree distribution. Our experiments indicate that ScamSweeper outperforms SIEGE, Ethident, and PDTGA in detecting Web3 scams, improving the weighted F1-score by at least 17.29% over a base value of 0.59. In phishing node detection, ScamSweeper improves the F1-score by at least 17.5% over DGTSG and BERT4ETH, from a base value of 0.80.
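The abstract does not specify how the structure temporal random walk weights its choices, so the following is only a minimal sketch of the general idea: a time-respecting walk over a directed transaction graph whose sampled edges are then grouped into time intervals. The toy edge list, the function names, the `alpha` decay parameter, and the weighting formula (exponential decay on the time gap multiplied by a log out-degree term) are all illustrative assumptions, not the ScamSweeper implementation.

```python
import math
import random
from collections import defaultdict

# Hypothetical toy edge list: (sender, receiver, timestamp).
EDGES = [
    ("a", "b", 1), ("a", "c", 2), ("b", "c", 3),
    ("c", "d", 4), ("c", "a", 5), ("d", "e", 6),
]


def build_out_index(edges):
    """Map each sender to its outgoing (receiver, timestamp) pairs."""
    out = defaultdict(list)
    for src, dst, ts in edges:
        out[src].append((dst, ts))
    return out


def temporal_walk(out, start, length, alpha=0.5, rng=None):
    """Time-respecting walk: each hop uses an edge no older than the previous hop.

    Assumed weighting (not taken from the paper): exp(-alpha * time gap)
    favours temporally close edges, and 1 + log(1 + out-degree of the
    receiver) favours structurally active neighbours.
    """
    rng = rng or random.Random()
    node, now, walk = start, -math.inf, []
    for _ in range(length):
        # Only edges at or after the current time keep the walk causal.
        candidates = [(dst, ts) for dst, ts in out.get(node, []) if ts >= now]
        if not candidates:
            break
        # On the first hop there is no previous time, so measure gaps
        # from the earliest candidate edge instead.
        base = now if now != -math.inf else min(ts for _, ts in candidates)
        weights = [
            math.exp(-alpha * (ts - base)) * (1.0 + math.log1p(len(out.get(dst, []))))
            for dst, ts in candidates
        ]
        dst, ts = rng.choices(candidates, weights=weights, k=1)[0]
        walk.append((node, dst, ts))
        node, now = dst, ts
    return walk


def split_into_intervals(walk_edges, width):
    """Group sampled edges into fixed-width time windows, ordered by time."""
    buckets = defaultdict(list)
    for src, dst, ts in walk_edges:
        buckets[ts // width].append((src, dst, ts))
    return [buckets[k] for k in sorted(buckets)]
```

Calling `temporal_walk(build_out_index(EDGES), "a", 4)` and passing the result to `split_into_intervals(..., 2)` yields an ordered list of small edge sets, one per time window. In a pipeline like the one the abstract describes, each such window would be encoded as a subgraph before the sequence is passed to a sequence model.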