Word Class Representations Spontaneously Emerge from Successor Representations Trained on Natural Language

📅 2026-05-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study investigates whether neural networks trained on text can spontaneously develop structured representations of word classes without explicit linguistic supervision. To this end, it transfers successor representations, a concept from reinforcement learning, to natural language processing; the authors describe this as, to their knowledge, the first systematic application of the framework to language. The model, implemented as a deep residual network, is trained on WikiText-103 with a KL-divergence objective to predict future word distributions across multiple temporal horizons. The results show that purely predictive learning leads to the unsupervised emergence of geometrically coherent part-of-speech clusters: short-horizon predictions emphasize syntactic features, while long-horizon predictions integrate broader semantic context, and finer-grained analysis reveals interpretable substructure within word classes. The work draws a conceptual connection among reinforcement learning, linguistics, and cognitive neuroscience.

📝 Abstract

Language models are typically trained to predict the next token in a sequence. Here, we explore an alternative predictive principle from reinforcement learning: Successor Representations (SRs), which model the expected discounted distribution of future states rather than the immediate next state. We transfer this framework to natural language and train neural networks to predict future word distributions across multiple temporal horizons, thereby learning representations of long-range transition structure. We train a deep residual neural network on WikiText-103 (103 million tokens; 20,000-word vocabulary) and optimize successor representations as probability distributions using KL divergence. Without explicit linguistic supervision, structured language representations emerge spontaneously. After training, the learned space develops a clear geometric organization with respect to part-of-speech (POS) categories: nouns, verbs, and adjectives become separable and recoverable through unsupervised clustering. This organization depends systematically on predictive horizon, with short horizons producing the strongest syntactic structure and longer horizons increasingly integrating broader contextual and semantic information. At finer resolutions, additional interpretable lexical substructure emerges, revealing coherent subclasses within major word categories. These findings suggest that syntactic categories need not be explicitly encoded but may arise as a consequence of predictive sequence learning. To our knowledge, this work provides the first systematic application of successor representations to natural language and establishes a conceptual bridge between reinforcement learning, linguistics, and cognitive neuroscience.
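The idea of a successor-style target for language can be illustrated with a small sketch. It is not the authors' implementation: the function names, the toy corpus, the integer word ids, and the specific choices of truncating the sum at a fixed horizon, weighting offset k by gamma**(k-1), and pooling counts per word type are all assumptions made for illustration. The abstract states only that the targets are probability distributions over future words, optimized with KL divergence across multiple horizons.

```python
import math
from collections import defaultdict


def successor_targets(tokens, vocab_size, gamma, horizon):
    """Empirical discounted future-word distribution per word type.

    Assumption (not from the paper): a future token at offset k = 1..horizon
    is counted with weight gamma**(k-1); counts are pooled over all
    occurrences of the current word and normalized to sum to 1.
    """
    counts = defaultdict(lambda: [0.0] * vocab_size)
    for t, word in enumerate(tokens):
        for k in range(1, horizon + 1):
            if t + k >= len(tokens):
                break  # ran off the end of the corpus
            counts[word][tokens[t + k]] += gamma ** (k - 1)
    targets = {}
    for word, row in counts.items():
        total = sum(row)
        targets[word] = [c / total for c in row]
    return targets


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(
        pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0
    )


# Toy corpus with hypothetical word ids: 0 = "the", 1 = "cat", 2 = "sat".
tokens = [0, 1, 2, 0, 1, 2]
short = successor_targets(tokens, vocab_size=3, gamma=0.5, horizon=1)
long_ = successor_targets(tokens, vocab_size=3, gamma=0.5, horizon=3)

# Loss of an uninformed (uniform) prediction against the short-horizon target.
uniform = [1 / 3] * 3
loss = kl_divergence(short[0], uniform)
```

With horizon 1 the target reduces to the ordinary next-word distribution, while a longer horizon spreads probability mass over words further ahead. In the setup the abstract describes, a network would be trained to output such distributions, and its learned representations would then be examined for part-of-speech structure.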

Problem

Research questions and friction points this paper is trying to address.

Successor Representations

Natural Language

Word Class

Predictive Learning

Syntactic Structure

Innovation

Methods, ideas, or system contributions that make the work stand out.

Successor Representations

Predictive Sequence Learning

Emergent Word Classes