🤖 AI Summary
The impact of cost function selection on generalization performance in Inductive Logic Programming (ILP) lacks systematic empirical validation.
Method: Within a constraint-solving framework, we conduct a large-scale evaluation of seven standard cost functions (including training error, description length, and hypothesis size) across over 20 domains and 1,000 learning tasks.
Contribution/Results: We find that no single cost function dominates uniformly across tasks. Minimizing training error or description length yields the lowest average generalization error, whereas minimizing hypothesis size alone does not reliably reduce it. These results challenge the common assumption that "simplicity implies better generalization," showing that syntactic compactness does not inherently correlate with predictive accuracy. The work establishes an empirical benchmark for cost function selection in ILP and offers insights into the interplay between model complexity, encoding efficiency, and generalization.
📝 Abstract
Recent inductive logic programming (ILP) approaches learn optimal hypotheses. An optimal hypothesis minimises a given cost function on the training data. There are many possible cost functions, such as the training error, textual complexity, or description length of a hypothesis. However, selecting an appropriate cost function remains a key open question. To address this gap, we extend a constraint-based ILP system to learn optimal hypotheses for seven standard cost functions. We then empirically compare the generalisation error of optimal hypotheses induced under these cost functions. Our results on over 20 domains and 1,000 tasks, including game playing, program synthesis, and image reasoning, show that, while no cost function consistently outperforms the others, minimising training error or description length has the best overall performance. Notably, our results indicate that minimising the size of hypotheses does not always reduce generalisation error.
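The idea of an "optimal hypothesis" can be made concrete with a toy sketch. The snippet below is purely illustrative and is not the paper's system: the hypothesis representation, the literal-counting size measure, and the simple description-length weighting are all hypothetical assumptions. It shows how the same candidate hypotheses can rank differently under different cost functions, mirroring the abstract's point that the smallest hypothesis need not be the most accurate one.

```python
# Illustrative sketch (hypothetical, not the paper's system): ranking
# candidate hypotheses under three of the cost functions discussed above.

def training_error(hypothesis, examples):
    """Number of training examples the hypothesis misclassifies."""
    return sum(1 for x, label in examples if hypothesis["predict"](x) != label)

def size(hypothesis):
    """Syntactic size: total number of literals across all clauses."""
    return sum(len(clause) for clause in hypothesis["clauses"])

def description_length(hypothesis, examples,
                       bits_per_literal=1.0, bits_per_error=2.0):
    """Toy MDL-style score: encode the hypothesis plus its errors."""
    return (bits_per_literal * size(hypothesis)
            + bits_per_error * training_error(hypothesis, examples))

def optimal(hypotheses, cost):
    """An optimal hypothesis minimises the given cost function."""
    return min(hypotheses, key=cost)

# Toy task: classify integers as even.
examples = [(n, n % 2 == 0) for n in range(10)]

h_exact = {"clauses": [["even(X)", "mod(X,2,0)"]],   # two literals, correct
           "predict": lambda x: x % 2 == 0}
h_small = {"clauses": [["even(X)"]],                 # one literal, over-general
           "predict": lambda x: True}

hypotheses = [h_exact, h_small]

# The smallest hypothesis is not the most accurate one:
assert optimal(hypotheses, size) is h_small
assert optimal(hypotheses, lambda h: training_error(h, examples)) is h_exact
assert optimal(hypotheses, lambda h: description_length(h, examples)) is h_exact
```

Under the size cost, the over-general one-literal hypothesis wins despite misclassifying half the examples; under training error or description length, the correct hypothesis wins, echoing the study's finding that minimising size alone does not reliably reduce generalisation error.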