🤖 AI Summary
This work addresses few-shot visual concept learning, aiming to emulate humans’ ability to rapidly induce compositional concepts from minimal examples. We propose a structured analogical learning framework: images are represented as object-relation graphs; concepts are modeled probabilistically via schemas over these graphs; an adaptive similarity metric dynamically weights object and relation matching; and a relation-selective enhancement mechanism simulates cognitive attention to improve generalization. The method integrates deep feature extraction, structured graph representation, probabilistic graphical modeling, and analogical mapping algorithms. Evaluated on standard few-shot classification benchmarks, our approach achieves human-level performance—matching or exceeding human accuracy—and significantly outperforms unstructured prototype-based and weakly structured baselines. These results empirically validate the necessity of structured representations and analogical reasoning for human-like rapid learning.
📝 Abstract
The ability to learn new visual concepts from limited examples is a hallmark of human cognition. While traditional category learning models represent each example as an unstructured feature vector, compositional concept learning is thought to depend on (1) structured representations of examples (e.g., directed graphs consisting of objects and their relations) and (2) the identification of shared relational structure across examples through analogical mapping. Here, we introduce Probabilistic Schema Induction (PSI), a prototype model that employs deep learning to perform analogical mapping over structured representations of only a handful of examples, forming a compositional concept called a schema. In doing so, PSI relies on a novel conception of similarity that weighs object-level similarity and relational similarity, as well as a mechanism for amplifying relations relevant to classification, analogous to selective attention parameters in traditional models. We show that PSI produces human-like learning performance and outperforms two controls: a prototype model that uses unstructured feature vectors extracted from a deep learning model, and a variant of PSI with weaker structured representations. Notably, we find that PSI's human-like performance is driven by an adaptive strategy that increases relational similarity over object-level similarity and upweights the contribution of relations that distinguish classes. These findings suggest that structured representations and analogical mapping are critical to modeling rapid human-like learning of compositional visual concepts, and demonstrate how deep learning can be leveraged to create psychological models.