🤖 AI Summary
Problem: Human infants rapidly acquire foundational social concepts, such as animacy and goal attribution, via unsupervised or few-shot learning, enabling robust prediction of future events; in contrast, current AI models rely heavily on large-scale labeled datasets and generalize poorly. Method: Inspired by developmental psychology, we formalize how conceptual hierarchies evolve and propose a multi-stage neural framework that integrates causal modeling, self-supervised representation learning, and concept-disentanglement regularization, in which innate social concepts guide subsequent representation learning. Contribution/Results: Our approach substantially narrows the gap between AI and human-like conceptual structure: it improves accuracy by 12.7% on social prediction tasks, reduces data requirements by 65%, and yields representations that generalize significantly better across tasks and scenarios than those of baseline models.
📝 Abstract
Early in development, infants learn a range of useful concepts whose acquisition can be challenging from a computational standpoint. This early learning comes together with an initial understanding of aspects of the meaning of these concepts, e.g., their implications, their causal relations, and their use in predicting likely future events. In many cases, all this is accomplished with little or no supervision and from relatively few examples compared with current network models. In learning about objects and human-object interactions, early-acquired and possibly innate concepts are often used in the process of learning additional, more complex concepts. In the current work, we model how early-acquired concepts are used in the learning of subsequent concepts, and compare the results with standard deep network modeling. We focus in particular on the use of the concepts of animacy and goal attribution in learning to predict future events. We show that the use of early concepts in the learning of new concepts leads to better learning (higher accuracy) and more efficient learning (requiring less data). We further show that this integration of early and new concepts shapes the representation of the concepts acquired by the model. The results show that when the concepts are learned in a human-like manner, the emerging representation is more useful, as measured by generalization to novel data and tasks. On a more general level, the results suggest that there are likely to be basic differences between the conceptual structures acquired by current network models and those acquired through human learning.