🤖 AI Summary
This work proposes a knowledge–data-driven machine learning (KD-ML) framework to enhance the generalization capability and physical consistency of learning models. Introducing the novel concept of “knowledge landmarks,” the approach integrates high-level abstract knowledge in the form of information granules with local numerical data in a complementary manner. A joint loss function is formulated, combining a data-fitting term with a granular regularization term whose weight is adjustable, thereby enabling synergistic optimization of prior knowledge and observed data. Experimental results on two physics-based benchmark tasks demonstrate that KD-ML significantly outperforms purely data-driven methods, confirming the effectiveness and innovation of the proposed framework in seamlessly unifying domain knowledge with empirical observations.
📝 Abstract
Informed Machine Learning has emerged as a viable generalization of Machine Learning (ML) by building a unified conceptual and algorithmic setting for constructing models on a common basis of knowledge and data. Physics-informed ML, which incorporates physics equations, is one of the developments within Informed Machine Learning. This study proposes a novel direction of Knowledge-Data ML, referred to as KD-ML, where numeric data are integrated with knowledge tidbits expressed in the form of granular knowledge landmarks. We advocate that data and knowledge are complementary in several fundamental ways: data are precise (numeric) and local, usually confined to some region of the input space, while knowledge is global and formulated at a higher level of abstraction. The knowledge can be represented as information granules and organized as a collection of input-output information granules called knowledge landmarks. By virtue of this evident complementarity, we develop a comprehensive design process for the KD-ML model and formulate an original augmented loss function L that additively combines a component responsible for optimizing the model on the available numeric data with a second component that acts as a granular regularizer, ensuring the model adheres to the granular constraints (knowledge landmarks). We examine the role of the hyperparameter in the loss function that balances the contributions and guiding roles of data and knowledge, and point to some essential tendencies associated with the quality of data (noise level) and the level of granularity of the knowledge landmarks. Experiments on two physics-governed benchmarks demonstrate that the proposed KD-ML model consistently outperforms data-driven ML models.
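The additive structure of the augmented loss can be sketched in code. The snippet below is an illustrative reconstruction, not the paper's exact formulation: it assumes knowledge landmarks are given as input-output interval pairs, probes each landmark at the midpoint of its input interval, and applies a hinge-style penalty when the model's prediction falls outside the landmark's output interval. The function name `kd_loss`, the weight `lam`, and the landmark encoding are all hypothetical choices for this sketch.

```python
import numpy as np

def kd_loss(model, X, y, landmarks, lam=0.5):
    """Augmented KD-ML loss sketch: data-fitting term + weighted granular regularizer.

    model     : callable mapping an input array to a prediction array
    X, y      : numeric training samples (the local, precise data)
    landmarks : list of ((x_lo, x_hi), (y_lo, y_hi)) input-output interval pairs
                (the global, abstract knowledge)
    lam       : hyperparameter balancing the guiding roles of data and knowledge
    """
    # Data-fitting component: mean squared error on the numeric samples.
    data_term = np.mean((model(X) - y) ** 2)

    # Granular regularizer: penalize predictions that leave a landmark's
    # output interval; zero penalty when the prediction lies inside it.
    gran_term = 0.0
    for (x_lo, x_hi), (y_lo, y_hi) in landmarks:
        pred = model(np.array([(x_lo + x_hi) / 2.0]))[0]
        gran_term += max(0.0, y_lo - pred) ** 2 + max(0.0, pred - y_hi) ** 2
    gran_term /= max(len(landmarks), 1)

    return data_term + lam * gran_term
```

With `lam = 0`, the loss reduces to a purely data-driven objective; increasing `lam` shifts the optimization toward satisfying the knowledge landmarks, which is the balancing behavior the abstract attributes to the hyperparameter.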