🤖 AI Summary
Real-world phenomena are inherently relational, comprising entities, attributes, and relationships, and structured data is ubiquitous in databases and knowledge graphs. Yet relational learning has not become a mainstream AI paradigm. The gap stems from dominant models’ overreliance on perceptual representations (e.g., pixels, text) and their neglect of semantic modeling for symbolic identifiers (e.g., IDs, category names), which flattens or loses relational structure.
Method: The paper systematically analyzes conventional machine learning’s flawed handling of identifiers and proposes a Statistical Relational AI framework that enables ontology-aware modeling of non-numerical identifiers and joint probabilistic inference over complex relational structures.
Contribution/Results: It identifies core barriers to deploying relational learning in practice and charts a viable path toward integrating relational priors with statistical learning—bypassing traditional feature engineering. The work provides theoretical foundations and practical guidelines for building AI systems that are more faithful to reality, interpretable, and generalizable.
📝 Abstract
AI seems to be taking over the world with systems that model pixels, words, and phonemes. The world is arguably made up not of pixels, words, and phonemes but of entities (objects, things, including events) with properties and relations among them. Surely we should model these, not the perception or description of them. You might suspect that the focus on modeling words and pixels arises because all of the (valuable) data in the world is in the form of text and images. Yet if you look into almost any company, you will find that its most valuable data is in spreadsheets, databases, and other relational formats. These are not the forms of data studied in introductory machine learning; they are full of product numbers, student numbers, transaction numbers, and other identifiers that can't be interpreted naively as numbers. The field that studies this sort of data has various names, including relational learning, statistical relational AI, and many others. This paper explains why relational learning is not taking over the world -- except in a few cases with restricted relations -- and what needs to be done to bring it to its rightful prominence.