🤖 AI Summary
Problem: While the self is a well-studied multidimensional psychological construct in cognitive science and phenomenology, it lacks computationally tractable and empirically verifiable linguistic representations in NLP. Method: This Ph.D. proposal plans a computational framework for systematically identifying “self-aspects” in text, grounded in an ontology of self-dimensions drawn from cognitive science and phenomenology. It also plans a corresponding gold-standard annotated dataset and a comparison of three model families: conventional discriminative models, generative large language models, and embedding-based retrieval. Contribution/Results: The planned contributions are the ontology, the annotated dataset, and an evaluation of the candidate models against four criteria: interpretability, ground-truth adherence, accuracy, and computational efficiency. Top-performing models will then be applied in case studies in mental health and empirical phenomenology.
📝 Abstract
This Ph.D. proposal outlines a plan to develop a computational framework for identifying Self-aspects in text. The Self is a multifaceted construct that is reflected in language. Although it has been described across disciplines such as cognitive science and phenomenology, it remains underexplored in natural language processing (NLP). Many aspects of the Self align with psychological and other well-researched phenomena (e.g., those related to mental health), which highlights the need for systematic NLP-based analysis. To address this, we plan to introduce an ontology of Self-aspects and a gold-standard annotated dataset. On this foundation, we will develop conventional discriminative models, generative large language models, and embedding-based retrieval approaches, and evaluate them against four main criteria: interpretability, ground-truth adherence, accuracy, and computational efficiency. Top-performing models will be applied in case studies in mental health and empirical phenomenology.