🤖 AI Summary
This work addresses the lack of interpretability in black-box neural networks for preference learning by proposing an approximation method that combines Inductive Learning of Answer Set Programs (ILASP) with Principal Component Analysis (PCA). PCA reduces the dimensionality of high-dimensional preference data, while ILASP uses weak constraints to approximate the neural network's behavior through global or local logical rules. Evaluated on a custom recipe preference dataset, the approach yields high-fidelity post-hoc explanations, enhancing model transparency while limiting the growth in computational time. The method thus offers a principled way to balance accuracy and interpretability for neural preference models.
📝 Abstract
In this paper, we propose using Learning from Answer Sets to approximate black-box models, such as Neural Networks (NNs), in the specific case of learning user preferences. We specifically explore the use of ILASP (Inductive Learning of Answer Set Programs) to approximate preference learning systems through weak constraints. We have created a dataset of user preferences over a set of recipes, which is used to train the NNs that we aim to approximate with ILASP. Our experiments investigate ILASP both as a global and as a local approximator of the NNs. These experiments address the challenge of approximating NNs operating on increasingly high-dimensional feature spaces while achieving adequate fidelity to the target model and limiting the increase in computational time. To handle this challenge, we propose a preprocessing step that exploits Principal Component Analysis to reduce the dataset's dimensionality while keeping our explanations transparent. Under consideration for publication in Theory and Practice of Logic Programming (TPLP).
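As a rough illustration of the PCA preprocessing step described in the abstract, the sketch below projects a feature matrix onto its top principal components before any symbolic learning takes place. This is a minimal NumPy-only sketch, not the authors' pipeline: the matrix shape, the number of components `k`, and the variable names are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the recipe feature matrix (shape is an assumption):
# 100 labelled examples, each described by 50 features.
X = rng.random((100, 50))

# Centre the data, then compute principal directions via SVD.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Keep the top-k components as the reduced feature space that the
# symbolic learner would then operate on.
k = 5
X_reduced = Xc @ Vt[:k].T          # shape (100, 5)

# Fraction of total variance retained by the k components.
explained = (S[:k] ** 2).sum() / (S ** 2).sum()
```

The reduced matrix `X_reduced` (here 5 columns instead of 50) is the kind of lower-dimensional input that keeps the downstream rule learning tractable, at the cost of components that must themselves be made interpretable.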