🤖 AI Summary
This work addresses a validity risk in natural language processing: using the geometric properties of semantic embeddings directly as measures of social constructs, even though such embeddings often conflate the target construct with confounding factors such as topic and style. To mitigate this, the authors propose the Construct Validity Protocol (CVP), a framework that combines causal representation learning with psychometric principles to connect conceptual definitions to quantitative validation. Central to CVP is counterfactual neutralization, a novel method that uses large language models to reduce confounding in embedding space. The framework also introduces a standardized test suite covering discriminant, incremental, and predictive validity, giving computational social science a verifiable, reproducible way to assess and refine embedding-based construct measurements.
📝 Abstract
Natural Language Processing is rapidly evolving into a primary instrument for Computational Social Science, with researchers increasingly using embeddings to measure latent constructs such as novelty, creativity, and bias. However, this transition faces a fundamental validity challenge: the "Proxy Presumption," or the reliance on geometric properties (e.g., cosine distance) as direct measures of social concepts. We argue that without explicit validation, unsupervised representations remain entangled mixtures of the target construct ($C$) and confounding attributes ($Z$) like topic, style, and authorship. To bridge the gap between semantic embeddings and valid social measures, we introduce the Construct Validity Protocol (CVP). Drawing on causal representation learning and psychometrics, the CVP offers a rigorous pipeline from conceptualization to quantitative verification. We further propose Counterfactual Neutralization, a novel method using LLMs to reduce confounding in embedding space. By providing a standardized Validity Suite, including tests for discriminant, incremental, and predictive validity, this work offers the community a toolkit to transform heuristic proxies into robust, scientifically defensible instruments.
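To make the entanglement of $C$ and $Z$ concrete, here is a minimal sketch of two regression-style checks in the spirit of discriminant and incremental validity. It is an illustration under stated assumptions, not the authors' Validity Suite: the abstract does not specify how the tests are computed, so the function names (`confound_share`, `incremental_gain`), the use of ordinary least squares, and the synthetic data are all hypothetical.

```python
import numpy as np


def cosine_distance(a, b):
    """Cosine distance 1 - cos(a, b) between two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (intercept added)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    total = y - y.mean()
    return 1.0 - float(residual @ residual) / float(total @ total)


def confound_share(proxy, confounds):
    """Assumed discriminant-style check: share of proxy variance
    explained by the confounds Z alone (high values are a warning sign)."""
    return r_squared(confounds, proxy)


def incremental_gain(proxy, confounds, outcome):
    """Assumed incremental-style check: R^2 gained on an external outcome
    when the proxy is added to a confounds-only baseline."""
    base = r_squared(confounds, outcome)
    full = r_squared(np.column_stack([confounds, proxy]), outcome)
    return full - base


# Synthetic illustration only: no real embeddings or study data.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 2))   # hypothetical confound scores Z (e.g. topic, style)
c = rng.normal(size=n)        # hypothetical latent construct C
# A proxy (think: a cosine-distance score) that mixes C with Z.
proxy = 0.5 * c + z @ np.array([1.0, 0.5]) + 0.1 * rng.normal(size=n)
outcome = c + 0.2 * rng.normal(size=n)  # hypothetical external criterion

print("share of proxy explained by Z:", confound_share(proxy, z))
print("incremental R^2 over Z:", incremental_gain(proxy, z, outcome))
```

In this toy setup the proxy is built to load more heavily on $Z$ than on $C$, so the first number shows how much of a geometric score can be confound rather than construct, while the second shows whether the proxy still adds information beyond the confounds.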