From Internal Representations to Text Quality: A Geometric Approach to LLM Evaluation

📅 2025-09-29
🤖 AI Summary
This work addresses the challenge of reference-free text quality assessment. Methodologically, it systematically analyzes geometric properties—such as intrinsic dimensionality, effective rank, and Schatten norms—of intermediate activation representations across multiple layers of large language models (LLMs). Empirical analysis reveals strong correlations between these geometric measures and both textual naturalness and human-perceived quality, consistently across diverse LLMs and architectural layers. Crucially, the study identifies intrinsic dimensionality and effective rank as universal, robust quality indicators, enabling zero-shot, reference-free evaluation without human annotations or reference texts. Experiments demonstrate that the proposed method delivers stable and reliable assessments across heterogeneous generated text corpora; furthermore, different LLMs exhibit high consensus in ranking text quality using these metrics. The approach significantly outperforms existing reference-free metrics (e.g., MAUVE) and establishes a novel paradigm for interpreting the mapping between internal model representations and textual quality.
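
As a rough illustration of two of the geometric measures named above, the following Python sketch computes effective rank and a Schatten p-norm from one layer's activation matrix. It assumes the activations are already available as a NumPy array of shape (tokens, hidden_dim) and that they are mean-centered before the SVD. The function names and the centering step are illustrative assumptions, not details taken from the paper. Effective rank is computed here as the exponential of the Shannon entropy of the normalized singular values, a common definition that may differ from the exact variant the authors use.

```python
import numpy as np


def singular_values(hidden_states: np.ndarray) -> np.ndarray:
    """Singular values of a mean-centered (tokens x hidden_dim) matrix."""
    # Centering is an assumption made for this sketch.
    centered = hidden_states - hidden_states.mean(axis=0, keepdims=True)
    return np.linalg.svd(centered, compute_uv=False)


def effective_rank(hidden_states: np.ndarray, eps: float = 1e-12) -> float:
    """exp(entropy) of the singular values normalized to sum to one."""
    s = singular_values(hidden_states)
    p = s / (s.sum() + eps)
    p = p[p > eps]  # drop numerically zero directions before taking logs
    entropy = -(p * np.log(p)).sum()
    return float(np.exp(entropy))


def schatten_norm(hidden_states: np.ndarray, p: float = 1.0) -> float:
    """Schatten p-norm: the l_p norm of the vector of singular values."""
    s = singular_values(hidden_states)
    return float((s ** p).sum() ** (1.0 / p))


if __name__ == "__main__":
    # Hypothetical stand-in for one layer's activations of one text.
    rng = np.random.default_rng(0)
    fake_activations = rng.normal(size=(128, 64))
    print(effective_rank(fake_activations))
    print(schatten_norm(fake_activations, p=1.0))
```

In practice the random matrix would be replaced by hidden states extracted from a chosen layer, with one score computed per text or per corpus.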

📝 Abstract
This paper bridges internal and external analysis approaches to large language models (LLMs) by demonstrating that geometric properties of internal model representations serve as reliable proxies for evaluating generated text quality. We validate a set of metrics including Maximum Explainable Variance, Effective Rank, Intrinsic Dimensionality, MAUVE score, and Schatten Norms measured across different layers of LLMs, demonstrating that Intrinsic Dimensionality and Effective Rank can serve as universal assessments of text naturalness and quality. Our key finding reveals that different models consistently rank text from various sources in the same order based on these geometric properties, indicating that these metrics reflect inherent text characteristics rather than model-specific artifacts. This allows a reference-free text quality evaluation that does not require human-annotated datasets, offering practical advantages for automated evaluation pipelines.
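
The cross-model consistency claim can be made concrete with a rank-agreement check. The sketch below assumes each model has produced one geometric score per text source and measures agreement as the mean pairwise Spearman correlation. The choice of Spearman correlation, the function names, and the example data layout are assumptions for illustration; the abstract does not state which agreement statistic the authors use.

```python
from itertools import combinations


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def mean_pairwise_spearman(scores_by_model):
    """Average Spearman correlation over all pairs of models.

    scores_by_model maps a model name to a list of scores, one per text
    source, with the sources in the same order for every model.
    """
    pairs = list(combinations(scores_by_model.values(), 2))
    return sum(spearman(a, b) for a, b in pairs) / len(pairs)
```

A value near 1 would indicate that the models order the text sources the same way, which is the behavior the abstract describes.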

Problem

Research questions and friction points this paper is trying to address.

Can geometric properties of internal LLM representations serve as reliable proxies for generated text quality?
Do Intrinsic Dimensionality and Effective Rank assess text naturalness consistently across different models?
Can text quality be evaluated reference-free, relying on inherent text characteristics rather than human annotation?
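
For intuition about intrinsic dimensionality, the sketch below estimates it from a point cloud with the TwoNN estimator in its maximum-likelihood form: for each point, take the ratio of the distances to its second and first nearest neighbors, then divide the number of points by the sum of the log ratios. TwoNN is swapped in here as a simple, well-known estimator; the page does not say which estimator the paper uses, and the function name and input layout are assumptions.

```python
import numpy as np


def twonn_intrinsic_dimension(points: np.ndarray) -> float:
    """TwoNN maximum-likelihood estimate of intrinsic dimension.

    points has shape (n_points, ambient_dim), for example one row per
    token representation. Memory use is O(n_points ** 2).
    """
    x = np.asarray(points, dtype=float)
    # Squared pairwise distances via the Gram matrix, clamped at zero
    # to absorb small negative values from floating-point rounding.
    sq_norms = (x ** 2).sum(axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    d2 = np.maximum(d2, 0.0)
    np.fill_diagonal(d2, np.inf)  # exclude each point's distance to itself

    # Columns 0 and 1 hold the nearest and second-nearest squared distances.
    nearest = np.partition(d2, 1, axis=1)[:, :2]
    r1 = np.sqrt(nearest[:, 0])
    r2 = np.sqrt(nearest[:, 1])

    keep = r1 > 0  # duplicate points give r1 == 0 and are skipped
    log_mu = np.log(r2[keep] / r1[keep])
    total = log_mu.sum()
    if total <= 0:
        raise ValueError("degenerate neighbor distances; cannot estimate")
    return float(keep.sum() / total)


if __name__ == "__main__":
    # Hypothetical data: a 2-D sheet embedded in a 5-D ambient space.
    rng = np.random.default_rng(0)
    sheet = np.concatenate(
        [rng.uniform(size=(1000, 2)), np.zeros((1000, 3))], axis=1
    )
    print(twonn_intrinsic_dimension(sheet))
```

The estimate reflects the dimension of the surface the points lie on rather than the ambient dimension, which is why it can stay small even when the hidden size is large.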

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses geometric properties of internal representations as proxies for text quality
Identifies Intrinsic Dimensionality and Effective Rank as universal measures of text naturalness and quality
Enables reference-free evaluation without human-annotated datasets

Authors

Viacheslav Yusupov
Danil Maksimov
Ameliia Alaeva
Anna Vasileva
Anna Antipina
Tatyana Zaitseva
Alina Ermilova
Evgeny Burnaev (Skoltech, Full Professor, Head of AI center, Head of research group, AIRI; topics: Generative Modeling, Manifold Learning, Surrogate Modeling, 3D Deep Learning)
Egor Shvetsov