🤖 AI Summary
This study investigates the geometric nature of affective representations within large language models (LLMs). Despite growing interest in LLMs’ emotional capabilities, the intrinsic structure and cross-lingual consistency of affective representations remain poorly understood. Method: We conduct geometric analysis of multilayer hidden state spaces, complemented by linear probing, cross-lingual alignment, and a learnable intervention module. Results and Contribution: We identify a low-dimensional, directionally encoded, and layer-wise continuous affective manifold that exhibits strong generalization across five languages and eight real-world sentiment datasets, alongside cross-domain stability. Our approach yields the first interpretable and controllable affective latent space in LLMs—enabling fine-grained, semantics-preserving sentiment manipulation with low intervention error and high predictive accuracy. This work establishes both a theoretical foundation and a practical framework for controllable affective generation and cross-lingual sentiment understanding.
📝 Abstract
This work investigates how large language models (LLMs) internally represent emotion by analyzing the geometry of their hidden-state space. The paper identifies a low-dimensional emotional manifold and shows that emotional representations are directionally encoded, distributed across layers, and aligned with interpretable dimensions. These structures are stable across depth and generalize to eight real-world emotion datasets spanning five languages. Cross-domain alignment yields low error and strong linear probe performance, indicating a universal emotional subspace. Within this space, internal emotion perception can be steered while preserving semantics using a learned intervention module, with especially strong control for basic emotions across languages. These findings reveal a consistent and manipulable affective geometry in LLMs and offer insight into how they internalize and process emotion.