AI Summary
Existing low-dimensional text projections struggle to flexibly and interpretably reflect user-specified semantic relationships. This work proposes a large language model (LLM)-driven semantic guidance mechanism: users express their intent by grouping only a few document examples, and the system leverages an LLM to translate these groupings into natural language descriptions, which are then extended to related documents. The resulting semantic signals guide representation refinement through text augmentation or embedding fusion strategies, without requiring model retraining. This approach enables explicit, interpretable, and continuously controllable reshaping of the projection space, effectively transforming the visualization canvas into an intent-driven semantic workspace. Experiments demonstrate that a single corpus can be reorganized from multiple semantic perspectives, and minimal user interaction substantially enhances both global and local alignment between the projection and the target semantic structure.
Abstract
Low-dimensional projections of text embeddings support visual analysis of document collections, but their spatial organization may not reflect the relationships an analyst intends to examine. Existing semantic interaction approaches encode semantic intent indirectly through geometric constraints or model updates, limiting interpretability and flexibility. We introduce LLM-augmented semantic steering, which enables analysts to express semantic intent by grouping a small set of example documents within the projection. A large language model externalizes this intent as natural-language representations and selectively extends it to related documents; the resulting semantic information is then incorporated into document representations via text augmentation or embedding-level blending, without retraining the underlying models. A case study illustrates how the same corpus can be reorganized from different semantic perspectives, while simulation-based evaluation shows that semantic steering improves global and local alignment with target semantic structures using only minimal interaction. Embedding-level blending further enables continuous and controllable steering of projection layouts. These results position projection spaces as intent-dependent semantic workspaces that can be reshaped through explicit, interpretable, language-mediated interaction.
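To make the idea of embedding-level blending more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that blending is a convex combination of a document embedding and the embedding of the LLM-generated intent description, controlled by a weight `alpha`. The abstract does not give the actual formula, so the function names, the normalization step, and the toy vectors are all illustrative assumptions.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def blend_embeddings(
    doc_vec: Sequence[float],
    intent_vec: Sequence[float],
    alpha: float,
) -> List[float]:
    """Hypothetical blending rule: (1 - alpha) * document + alpha * intent.

    alpha = 0 keeps the original document embedding (up to normalization);
    alpha = 1 replaces it with the intent embedding. Values in between move
    the document continuously toward the intent description.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(doc_vec) != len(intent_vec):
        raise ValueError("embeddings must have the same dimensionality")
    d = l2_normalize(doc_vec)
    s = l2_normalize(intent_vec)
    mixed = [(1.0 - alpha) * a + alpha * b for a, b in zip(d, s)]
    return l2_normalize(mixed)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors."""
    u_n, v_n = l2_normalize(u), l2_normalize(v)
    return sum(a * b for a, b in zip(u_n, v_n))


if __name__ == "__main__":
    # Toy 2-D vectors standing in for real text embeddings (assumption).
    doc = [1.0, 0.0]
    intent = [0.0, 1.0]
    for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
        blended = blend_embeddings(doc, intent, alpha)
        print(alpha, round(cosine(blended, intent), 3))
```

Under this assumed rule, the blended vectors would be passed to the existing projection method in place of the original embeddings, so no model is retrained. Because `alpha` is a continuous weight, sweeping it from 0 to 1 gives one plausible reading of the continuous and controllable steering the abstract describes.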