🤖 AI Summary
Existing differentiable inductive logic programming (ILP) approaches struggle to learn symbolic rules directly from raw continuous data, such as time series or images, primarily due to the explicit label leakage problem: without supervision from feature-level labels, they cannot reliably map continuous inputs to symbolic variables. This work proposes an end-to-end neuro-symbolic framework that integrates self-supervised differentiable clustering with a novel differentiable ILP formulation, enabling direct learning of interpretable symbolic rules from raw data without requiring explicit labels. By circumventing the label leakage bottleneck, the method preserves rule interpretability while substantially improving generalization and applicability. Experiments on both temporal and visual tasks demonstrate its ability to discover accurate and intuitively meaningful symbolic rules.
📝 Abstract
Rule learning-based models are widely used in scenarios that demand high interpretability due to their transparent structures. Inductive logic programming (ILP), a form of machine learning, induces rules from facts while maintaining interpretability. Differentiable ILP models enhance this process by leveraging neural networks to improve robustness and scalability. However, most differentiable ILP methods rely on symbolic datasets and face challenges when learning directly from raw data. Specifically, they struggle with explicit label leakage: the inability to map continuous inputs to symbolic variables without explicit supervision of input feature labels. In this work, we address this issue by integrating a self-supervised differentiable clustering model with a novel differentiable ILP model, enabling rule learning from raw data without explicit label leakage. The learned rules effectively describe the raw data in terms of its features. We demonstrate that our method intuitively and precisely learns generalized rules from time-series and image data.
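To make the core idea concrete, the sketch below illustrates what a *differentiable* (soft) cluster assignment looks like: continuous inputs are mapped to a probability distribution over symbolic cluster labels without any label supervision, so gradients can flow from a downstream rule-learning loss back to the clustering stage. This is a minimal, hypothetical illustration using a softmax over negative squared distances (a common soft k-means relaxation); it is not the paper's actual model, and the function name, temperature parameter, and toy data are all assumptions.

```python
import numpy as np

def soft_assign(x, centroids, temperature=0.5):
    """Differentiable cluster assignment (illustrative, not the paper's model).

    Maps each continuous input to a soft distribution over symbolic
    cluster labels via a softmax over negative squared distances, so
    the mapping is differentiable in both the inputs and the centroids.
    """
    # Squared Euclidean distance from each input to each centroid:
    # shape (num_inputs, num_clusters).
    d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / temperature
    # Numerically stable softmax over the cluster dimension.
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

# Toy example: 1-D inputs near two candidate "symbols" (cluster centres).
x = np.array([[0.1], [0.2], [2.9], [3.1]])
centroids = np.array([[0.0], [3.0]])
probs = soft_assign(x, centroids)  # each row is a distribution over 2 symbols
```

A hard argmax over `probs` would recover discrete symbols for rule evaluation, while training can keep the soft assignments so the symbol grounding is learned end to end rather than supervised with feature-level labels.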