🤖 AI Summary
This paper addresses a key limitation of rule-based models (e.g., decision trees): their learning algorithms can latch onto spurious correlations and fail to identify causal relationships. To overcome this, we propose a novel method that integrates invariant causal prediction with the set covering machine framework. Our approach is the first to embed environment-invariance constraints into the learning of conjunctions/disjunctions of binary-valued rules, and it introduces a causal sufficiency test that yields polynomial-time guarantees for identifying the causal parents of the target variable. Unlike conventional interpretable models, our method achieves both strong interpretability and causal robustness. We provide formal proofs of correct causal identification and empirically demonstrate significant improvements over state-of-the-art baselines on multiple synthetic and real-world datasets. The method consistently extracts valid causal rules under challenging conditions, including label noise, covariate shift, and other distributional shifts, while preserving computational efficiency and transparency.
📝 Abstract
Rule-based models, such as decision trees, appeal to practitioners due to their interpretable nature. However, the learning algorithms that produce such models are often vulnerable to spurious associations and thus are not guaranteed to extract causally relevant insights. In this work, we build on ideas from the invariant causal prediction literature to propose Invariant Causal Set Covering Machines, an extension of the classical Set Covering Machine algorithm for conjunctions/disjunctions of binary-valued rules that provably avoids spurious associations. We demonstrate both theoretically and empirically that our method can identify the causal parents of a variable of interest in polynomial time.
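The core idea described above, learning a conjunction of binary rules while discarding rules whose relationship with the target changes across environments, can be illustrated with a small sketch. This is not the paper's actual ICSCM algorithm or its causal sufficiency test: the function names (`rule_is_invariant`, `greedy_invariant_conjunction`), the tolerance threshold `tol`, and the simple "compare P(y=1 | rule fires) across environments" check are all illustrative assumptions standing in for the method's formal invariance constraint.

```python
import numpy as np

def rule_is_invariant(X, y, env, j, tol=0.1):
    # Crude proxy for an invariance test: P(y=1 | x_j = 1) should be
    # roughly the same in every environment. (Illustrative assumption,
    # not the paper's causal sufficiency test.)
    rates = []
    for e in np.unique(env):
        mask = (env == e) & (X[:, j] == 1)
        if mask.sum() > 0:
            rates.append(y[mask].mean())
    return len(rates) > 0 and (max(rates) - min(rates)) <= tol

def greedy_invariant_conjunction(X, y, env, max_rules=3, tol=0.1):
    # Set-Covering-Machine-style greedy conjunction: each chosen binary
    # feature must (a) exclude many remaining negatives and (b) pass the
    # invariance check above, so spuriously correlated features are skipped.
    chosen, active = [], np.ones(len(y), dtype=bool)
    for _ in range(max_rules):
        best_j, best_score = None, 0
        for j in range(X.shape[1]):
            if j in chosen or not rule_is_invariant(X, y, env, j, tol):
                continue
            # Requiring x_j = 1 excludes the still-active points with x_j = 0:
            # reward excluded negatives, penalize excluded positives.
            excluded = active & (X[:, j] == 0)
            score = (y[excluded] == 0).sum() - (y[excluded] == 1).sum()
            if score > best_score:
                best_j, best_score = j, score
        if best_j is None:
            break
        chosen.append(best_j)
        active &= X[:, best_j] == 1
    return chosen

# Synthetic two-environment example: x0 is causal (y = x0 everywhere),
# while x1 is spuriously correlated with y in env 0 and anti-correlated
# in env 1, so only x0 should survive the invariance check.
rng = np.random.default_rng(0)
n = 400
env = np.repeat([0, 1], n)
x0 = rng.integers(0, 2, 2 * n)
y = x0.copy()
x1 = np.where(env == 0, y, 1 - y)
X = np.stack([x0, x1], axis=1)
chosen = greedy_invariant_conjunction(X, y, env)
```

On this toy data the spurious feature `x1` fails the invariance check (its conditional rate flips from 1 to 0 between environments), so the learned conjunction consists of the causal feature alone.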