🤖 AI Summary
Existing methods for high-utility sequential rule mining do not guarantee that the utility ratio grows monotonically as a rule is extended, and the effect of newly added items on a rule's utility and confidence remains unclear. To address this, this work formally introduces the problem of mining high-utility sequential rules with an increasing utility ratio and proposes the SRIU algorithm. SRIU uses two expansion orders, left-right and right-left, each with its own upper bounds and pruning strategies, and integrates several optimization techniques, including Item-Pair Estimated Utility Pruning (IPEUP), bitmap-based storage to reduce memory consumption, and a compact utility table, to improve mining efficiency. Experiments on both real-world and synthetic datasets demonstrate the effectiveness of the method, and an evaluation with confidence and conviction metrics indicates that SRIU can improve the relevance of the mined rules.
📝 Abstract
Utility-driven mining is an essential task in data science, as it can provide deeper insight into the real world. High-utility sequential rule mining (HUSRM) aims at discovering sequential rules with high utility and high confidence. Because it uses confidence as an evaluation metric, it can provide reliable information for decision-making, and several algorithms, such as HUSRM and US-Rule, have been proposed for this task. However, in current rule-growth mining methods, the link between high-utility sequential rules (HUSRs) and the way they are generated remains ambiguous. Specifically, it is unclear whether adding new items to a rule increases or decreases its utility or confidence. Therefore, in this paper, we formulate the problem of mining HUSRs with an increasing utility ratio. To address it, we introduce a novel algorithm called SRIU, which discovers all HUSRs with an increasing utility ratio using two distinct expansion methods: left-right expansion and right-left expansion. SRIU also uses the item pair estimated utility pruning strategy (IPEUP) to reduce the search space. Moreover, for each of the two expansion methods, a set of upper bounds and corresponding pruning strategies is introduced. To enhance the efficiency of SRIU, several optimizations are incorporated, including bitmaps to reduce memory consumption and a compact utility table for the mining procedure. Finally, extensive experiments on both real-world and synthetic datasets demonstrate the effectiveness of the proposed method. Moreover, to better assess the quality of the generated sequential rules, metrics such as confidence and conviction are employed, which further demonstrate that SRIU can improve the relevance of mining results.
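The confidence and conviction metrics mentioned in the abstract can be illustrated with a small sketch. It assumes the usual sequential-rule semantics from the rule-growth literature (a sequence supports X → Y when every item of X occurs before every item of Y) and the standard conviction formula (1 - sup(Y)) / (1 - conf(X → Y)), with sup(Y) taken as a relative support. The function names and the toy database are hypothetical and are not taken from SRIU; utilities and the utility ratio are deliberately left out because the abstract does not define them.

```python
from math import inf


def contains(sequence, items):
    """True if every item in `items` occurs somewhere in the sequence."""
    return items <= set().union(*sequence)


def supports_rule(sequence, antecedent, consequent):
    """True if all antecedent items occur strictly before all consequent items.

    Assumption: a sequence is a list of itemsets, and the rule is supported
    when some split point puts the antecedent in the prefix and the
    consequent in the suffix.
    """
    for k in range(1, len(sequence)):
        before = set().union(*sequence[:k])
        after = set().union(*sequence[k:])
        if antecedent <= before and consequent <= after:
            return True
    return False


def confidence_and_conviction(db, antecedent, consequent):
    """Return (confidence, conviction) of antecedent -> consequent over db."""
    n = len(db)
    sup_x = sum(contains(s, antecedent) for s in db)
    sup_y = sum(contains(s, consequent) for s in db)
    sup_rule = sum(supports_rule(s, antecedent, consequent) for s in db)

    conf = sup_rule / sup_x if sup_x else 0.0
    # Conviction is unbounded when the rule never fails (confidence of 1).
    conviction = inf if conf == 1.0 else (1 - sup_y / n) / (1 - conf)
    return conf, conviction


# Hypothetical toy database: each sequence is a list of itemsets.
toy_db = [
    [{"a"}, {"b"}, {"c"}],
    [{"a"}, {"c"}],
    [{"c"}, {"a"}],
    [{"b"}, {"d"}],
]

conf, conv = confidence_and_conviction(toy_db, {"a"}, {"c"})
print(f"confidence={conf:.3f}, conviction={conv:.3f}")
```

A conviction above 1 suggests the consequent follows the antecedent more often than independence would predict, while a value below 1 suggests the opposite, which is why it is a useful complement to confidence when comparing rule sets.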