🤖 AI Summary
Current frameworks for evaluating emotional intelligence (EI) in AI lack a rigorous theoretical foundation, predominantly adopting anthropocentric criteria that ignore AI systems' inherent capability boundaries and the limits of what can be empirically measured. This work systematically reviews emotion theory and human EI models, exposing fundamental flaws in prevailing benchmarks, particularly their ontological assumptions about emotion and their ill-defined operationalizations of EI capabilities. We propose the first AI-centric definition of EI, specifying four empirically grounded, measurable core dimensions: affect perception, affect interpretation, contextually appropriate response generation, and cross-cultural adaptability. Building on this definition, we establish three principled evaluation criteria: elimination of subjective experience as a requirement, empirical verifiability, and scalability across architectures and modalities. The study thus introduces a theoretically coherent, AI-specific EI framework and assessment paradigm, laying the normative groundwork for scientifically valid, standardized, and reproducible AI-EI evaluation.
📝 Abstract
In this paper, we develop the position that current frameworks for evaluating emotional intelligence (EI) in artificial intelligence (AI) systems need refinement because they do not adequately or comprehensively measure the aspects of EI relevant to AI. Human EI often involves a phenomenological component and a sense of understanding that artificial systems lack; therefore, some aspects of EI are irrelevant to evaluating AI systems. However, EI also includes the abilities to sense an emotional state, explain it, respond appropriately, and adapt to new contexts (e.g., multicultural ones), and AI systems can do such things to greater or lesser degrees. Several benchmark frameworks specialize in evaluating the capacity of different AI models to perform tasks related to EI, but these often lack a solid foundation regarding the nature of emotion and what it is to be emotionally intelligent. We begin by reviewing theories of emotion and of EI in general, evaluating the extent to which each applies to artificial systems. We then critically evaluate the available benchmark frameworks, identifying where each falls short in light of the account of EI developed in the first section. Lastly, we outline options for improving evaluation strategies so as to avoid these shortcomings in the EI evaluation of AI systems.