🤖 AI Summary
Current text embedding evaluation leaves linguistic phenomena opaque and lacks formalized mechanisms for manipulating meaning. To address this, the work proposes an interpretable and controllable semantic transformation framework: sentences are first parsed into semantic graphs; fine-grained structural edits are then applied to these graphs according to human-designed semantic rules; finally, high-fidelity transformed texts are synthesized via constrained generation and automated filtering. The method precisely isolates specific semantic shifts (such as negation, tense, and coreference), significantly enhancing the diagnostic power of embedding evaluation. Experiments show that the generated hard negative pairs achieve 92.3% accuracy in human evaluation, resolving the semantic attribution ambiguity prevalent in existing benchmarks. This work establishes a novel paradigm for interpretable, causally grounded evaluation of text embeddings.
📝 Abstract
We propose Sentence Smith, a framework that enables controlled and specified manipulation of text meaning. It consists of three main steps: (1) parsing a sentence into a semantic graph, (2) applying human-designed semantic manipulation rules, and (3) generating text from the manipulated graph. A final filtering step (4) ensures the validity of the applied transformation. To demonstrate the utility of Sentence Smith in an application study, we use it to generate hard negative pairs that challenge text embedding models. Since the controllable generation makes it possible to clearly isolate different types of semantic shifts, we can gain deeper insights into the specific strengths and weaknesses of widely used text embedding models, also addressing an issue in current benchmarking, where linguistic phenomena remain opaque. Human validation confirms that the generations produced by Sentence Smith are highly accurate.
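The four-step pipeline can be sketched in miniature. The toy parser, negation rule, surface realizer, and filter below are hypothetical stand-ins (the paper's actual components, e.g. its semantic parser and constrained generator, are not specified here); the sketch only illustrates the parse → manipulate → generate → filter control flow on a trivial "subject verb object" sentence.

```python
from dataclasses import dataclass, field, replace

@dataclass
class SemanticGraph:
    """Toy predicate-argument graph standing in for a full semantic graph."""
    predicate: str
    arguments: list = field(default_factory=list)
    polarity: str = "+"

def parse(sentence: str) -> SemanticGraph:
    # Step 1 (toy stand-in): split a simple "X verbs Y" sentence into a
    # predicate-argument structure; a real system would use a semantic parser.
    subj, verb, *rest = sentence.rstrip(".").split()
    return SemanticGraph(predicate=verb, arguments=[subj] + rest)

def negate(graph: SemanticGraph) -> SemanticGraph:
    # Step 2: one human-designed manipulation rule -- flip the polarity
    # attribute of the graph, leaving all other structure untouched.
    return replace(graph, polarity="-")

def generate(graph: SemanticGraph) -> str:
    # Step 3 (toy surface realizer): re-linearize the graph as text.
    subj, *rest = graph.arguments
    if graph.polarity == "-":
        return f"{subj} does not {graph.predicate.rstrip('s')} {' '.join(rest)}".strip() + "."
    return f"{subj} {graph.predicate} {' '.join(rest)}".strip() + "."

def is_valid(original: str, transformed: str) -> bool:
    # Step 4: filtering -- here just a sanity check that the intended
    # transformation actually changed the surface form.
    return transformed != original

sentence = "Alice likes tea"
negated = generate(negate(parse(sentence)))
print(negated)          # -> "Alice does not like tea."
assert is_valid(sentence, negated)
```

The original sentence and its negated counterpart form exactly the kind of hard negative pair described above: lexically near-identical texts whose meanings differ by a single, known semantic shift.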