Discovering Semantic Subdimensions through Disentangled Conceptual Representations

📅 2025-08-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates fine-grained substructure within coarse-grained semantic categories (e.g., “animal”, “tool”). To this end, we propose the Disentangled Continuous Semantic Representation Model (DCSRM), a data-driven approach that decomposes large language model word embeddings into multiple interpretable sub-embeddings, each encoding a distinct semantic subdimension. We identify semantic polarity (such as size, animacy, and valence) as a key latent variable driving this decomposition. Our method combines embedding disentanglement, interpretability analysis, and voxel-wise fMRI encoding models, and is validated against empirical neuroimaging data. Results indicate that the identified subdimensions have statistically significant and spatially specific cortical representations, which makes the semantic representation finer-grained and easier to interpret in cognitive and neurobiological terms.

📝 Abstract

Understanding the core dimensions of conceptual semantics is fundamental to uncovering how meaning is organized in language and the brain. Existing approaches often rely on predefined semantic dimensions that offer only broad representations, overlooking finer conceptual distinctions. This paper proposes a novel framework to investigate the subdimensions underlying coarse-grained semantic dimensions. Specifically, we introduce a Disentangled Continuous Semantic Representation Model (DCSRM) that decomposes word embeddings from large language models into multiple sub-embeddings, each encoding specific semantic information. Using these sub-embeddings, we identify a set of interpretable semantic subdimensions. To assess their neural plausibility, we apply voxel-wise encoding models to map these subdimensions to brain activation. Our work offers more fine-grained interpretable semantic subdimensions of conceptual meaning. Further analyses reveal that semantic dimensions are structured according to distinct principles, with polarity emerging as a key factor driving their decomposition into subdimensions. The neural correlates of the identified subdimensions support their cognitive and neuroscientific plausibility.

Problem

Research questions and friction points this paper is trying to address.

Identifying fine-grained semantic subdimensions beyond broad predefined categories

Decomposing word embeddings to reveal interpretable conceptual distinctions

Mapping discovered subdimensions to neural representations for biological validation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Decomposes word embeddings into sub-embeddings

Identifies interpretable semantic subdimensions through decomposition

Maps subdimensions to brain activation using voxel-wise encoding models
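
To make the voxel-wise encoding step concrete, here is a minimal sketch on synthetic data. It is not the paper's pipeline: the abstract does not specify the regression method, so closed-form ridge regression, the correlation score, the array sizes, and the names `fit_ridge` and `voxelwise_correlation` are all assumptions chosen for illustration. The random features stand in for one sub-embedding, and no intercept is fitted because the synthetic data are zero-mean.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression; returns one weight column per voxel."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)

def voxelwise_correlation(Y_true, Y_pred):
    """Pearson correlation between measured and predicted responses, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / np.where(denom == 0, 1.0, denom)

# Synthetic stand-ins (assumed sizes): 200 words, an 8-dim sub-embedding, 50 voxels.
rng = np.random.default_rng(0)
n_words, n_dims, n_voxels = 200, 8, 50
X = rng.standard_normal((n_words, n_dims))        # sub-embedding features per word
true_W = rng.standard_normal((n_dims, n_voxels))
true_W[:, 25:] = 0.0                              # last 25 voxels carry no signal
Y = X @ true_W + 0.5 * rng.standard_normal((n_words, n_voxels))

# Fit on 150 words, score on the 50 held-out words.
train, test = slice(0, 150), slice(150, 200)
W = fit_ridge(X[train], Y[train], alpha=1.0)
scores = voxelwise_correlation(Y[test], X[test] @ W)

print("mean r, responsive voxels:", scores[:25].mean())
print("mean r, unresponsive voxels:", scores[25:].mean())
```

Fitting one such model per sub-embedding and comparing the resulting score maps is one way to ask which voxels a given subdimension predicts; in practice the regularization strength would be chosen by cross-validation.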
