🤖 AI Summary
This work proposes a model-agnostic framework for association rule learning, together with TabProbe, an instantiation that uses pretrained tabular foundation models (TFMs). It targets two known weaknesses: the rule explosion and poor scalability of traditional association rule mining, and the performance degradation of existing neural methods in low-data regimes. TabProbe uses TFMs as conditional probability estimators and extracts rules from them directly, without frequent itemset mining or task-specific training. In evaluations on tabular datasets of varying sizes, it produces concise, high-quality rules as measured by standard rule quality metrics and downstream classification performance, and it remains robust in low-data settings.
📝 Abstract
Association Rule Mining (ARM) is a fundamental task for knowledge discovery in tabular data and is widely used in high-stakes decision-making. Classical ARM methods rely on frequent itemset mining, leading to rule explosion and poor scalability, while recent neural approaches mitigate these issues but suffer from degraded performance in low-data regimes. Tabular foundation models (TFMs), pretrained on diverse tabular data with strong in-context generalization, provide a basis for addressing these limitations. We introduce a model-agnostic association rule learning framework that extracts association rules from any conditional probabilistic model over tabular data, enabling us to leverage TFMs. We then introduce TabProbe, an instantiation of our framework that utilizes TFMs as conditional probability estimators to learn association rules out-of-the-box without frequent itemset mining. We evaluate our approach on tabular datasets of varying sizes based on standard ARM rule quality metrics and downstream classification performance. The results show that TFMs consistently produce concise, high-quality association rules with strong predictive performance and remain robust in low-data settings without task-specific training. Source code is available at https://github.com/DiTEC-project/tabprobe.
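To make the core idea concrete, the following is a minimal sketch of extracting rules from a conditional probability estimator instead of from mined frequent itemsets. It is not the TabProbe implementation: the toy table, the function names `conditional_prob` and `extract_rules`, and the threshold `min_conf` are all assumptions made for illustration, and the estimator here is a plain empirical frequency count standing in for the predicted probability a TFM would supply.

```python
from itertools import combinations

# Toy table (made-up data): each row maps a feature to a categorical value.
ROWS = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
]


def conditional_prob(rows, antecedent, target):
    """Hypothetical stand-in estimator: empirical P(target | antecedent).

    Returns the probability and the number of rows matching both sides.
    A TFM-based setup would return the model's predicted probability instead.
    """
    matching = [r for r in rows if all(r[f] == v for f, v in antecedent)]
    hits = sum(1 for r in matching if r[target[0]] == target[1])
    prob = hits / len(matching) if matching else 0.0
    return prob, hits


def extract_rules(rows, max_antecedent=2, min_conf=0.9):
    """Probe the estimator with candidate antecedents and keep confident rules."""
    items = sorted({(f, v) for r in rows for f, v in r.items()})
    rules = []
    for k in range(1, max_antecedent + 1):
        for antecedent in combinations(items, k):
            features = {f for f, _ in antecedent}
            if len(features) < k:
                continue  # skip antecedents that use one feature twice
            for target in items:
                if target[0] in features:
                    continue  # the consequent must be a different feature
                conf, hits = conditional_prob(rows, antecedent, target)
                if hits and conf >= min_conf:
                    # support = fraction of rows matching antecedent and consequent
                    rules.append((antecedent, target, conf, hits / len(rows)))
    return rules


for antecedent, target, conf, support in extract_rules(ROWS, max_antecedent=1):
    lhs = " AND ".join(f"{f}={v}" for f, v in antecedent)
    print(f"{lhs} -> {target[0]}={target[1]} (conf={conf:.2f}, support={support:.2f})")
```

The estimator is passed through a single function, so the rule extraction loop does not depend on how the probabilities are produced. Swapping the frequency count for a model's predicted probabilities is the step the abstract describes as leveraging TFMs; how the paper selects candidate antecedents and thresholds is not stated in the abstract and is not reproduced here.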