🤖 AI Summary
This study addresses poor model interpretability and the misalignment between human and artificial intelligence (AI) cognition by bridging principles of human cognitive neuroscience with AI modeling. The authors propose CLIP-HBA, a framework that integrates human behavioral embeddings and multi-level magnetoencephalography (MEG) neural data, both group-averaged and subject-specific, into the CLIP architecture. Through staged, individualized fine-tuning, CLIP-HBA dynamically models the perceptual-to-decision neural representation process. The work introduces a millisecond-resolution neural dynamic alignment method and a cross-subject personalized modeling paradigm (CLIP-HBA-MEG). Experiments demonstrate significant improvements: +12.7% accuracy in predicting human similarity judgments, an average increase of 0.18 in R² for MEG time-series decoding, and, critically, the first brain-inspired AI model to achieve both generalizability and subject-specific fidelity. This work establishes a unified formal framework bridging cognitive neuroscience and interpretable AI.
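The millisecond-resolution alignment mentioned above can be illustrated with a representational similarity analysis (RSA) sketch: at each MEG time point, correlate the model's representational dissimilarity matrix (RDM) with the RDM computed from that time point's sensor patterns, yielding a time course of model-brain alignment. All shapes, data, and function names below are simulated illustrations, not the paper's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

def rdm(X):
    # Representational dissimilarity matrix: 1 - cosine similarity
    # between all pairs of row vectors (stimuli).
    Z = X / np.linalg.norm(X, axis=1, keepdims=True)
    return 1.0 - Z @ Z.T

def upper(m):
    # Off-diagonal upper triangle, the standard RSA comparison vector.
    return m[np.triu_indices_from(m, k=1)]

# Simulated setup: 30 stimuli, 64-dim model embeddings,
# MEG with 32 sensors sampled at 100 time points.
n_stim, n_feat, n_time, n_sensors = 30, 64, 100, 32
model_rdm_vec = upper(rdm(rng.normal(size=(n_stim, n_feat))))
meg = rng.normal(size=(n_time, n_stim, n_sensors))

# Time-resolved alignment: one model-brain RDM correlation per time point.
alignment = np.array([
    np.corrcoef(model_rdm_vec, upper(rdm(meg[t])))[0, 1]
    for t in range(n_time)
])
```

With real data, peaks in this curve would indicate when the brain's stimulus geometry most resembles the model's, which is the kind of temporal signature a millisecond-level alignment method targets.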
📝 Abstract
The integration of human and artificial intelligence represents a scientific opportunity to advance our understanding of information processing, as each system offers unique computational insights that can enhance and inform the other. The synthesis of human cognitive principles with artificial intelligence has the potential to produce more interpretable and functionally aligned computational models, while simultaneously providing a formal framework for investigating the neural mechanisms underlying perception, learning, and decision-making through systematic model comparisons and representational analyses. In this study, we introduce personalized brain-inspired modeling that integrates human behavioral embeddings and neural data to align with cognitive processes. We took a stepwise approach, fine-tuning the Contrastive Language-Image Pre-training (CLIP) model with large-scale behavioral decisions, group-level neural data, and finally, participant-level neural data within a broader framework that we have named CLIP-Human-Based Analysis (CLIP-HBA). We found that fine-tuning on behavioral data enhances the model's ability to predict human similarity judgments while indirectly aligning it with dynamic representations captured via MEG. To gain further mechanistic insight into the temporal evolution of cognitive processes, we introduced a model specifically fine-tuned on millisecond-level MEG neural dynamics (CLIP-HBA-MEG). This model achieved enhanced temporal alignment with human neural processing while still improving behavioral alignment. Finally, we trained individualized models on participant-specific neural data, effectively capturing individualized neural dynamics and highlighting the potential for personalized AI systems. These personalized systems have far-reaching implications for medicine, cognitive research, human-computer interfaces, and AI development.
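One plausible reading of the staged fine-tuning is an RSA-style objective that nudges model embeddings so their pairwise similarity structure matches a target RDM derived from behavioral judgments (and, in later stages, from group-level or participant-level MEG). The sketch below trains a stand-in linear head on simulated data; the loss formulation, optimizer, and all names are assumptions for illustration, not the paper's actual training recipe, in which the full CLIP image encoder would be fine-tuned.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def rdm(z):
    # 1 - cosine similarity between all pairs of embeddings.
    z = F.normalize(z, dim=1)
    return 1.0 - z @ z.T

def alignment_loss(emb, target_rdm):
    # 1 - Pearson correlation between the upper triangles of the
    # model RDM and the target RDM; minimizing this pushes the model's
    # similarity structure toward the human-derived structure.
    m = rdm(emb)
    iu = torch.triu_indices(m.size(0), m.size(0), offset=1)
    a, b = m[iu[0], iu[1]], target_rdm[iu[0], iu[1]]
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return 1.0 - (a * b).mean()

torch.manual_seed(0)
head = nn.Linear(512, 64)        # stand-in for a CLIP projection head
opt = torch.optim.Adam(head.parameters(), lr=1e-2)

feats = torch.randn(24, 512)                   # simulated CLIP image features
behav_rdm = rdm(torch.randn(24, 16)).detach()  # simulated behavioral RDM

losses = []
for _ in range(100):
    opt.zero_grad()
    loss = alignment_loss(head(feats), behav_rdm)
    loss.backward()
    opt.step()
    losses.append(loss.item())
```

A staged pipeline in this spirit would repeat the loop with progressively more specific targets: a behavioral RDM first, then group-averaged MEG RDMs, then a single participant's RDMs, which is one way the "group-level then participant-level" progression in the abstract could be operationalized.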