Let's Predict Sentence by Sentence

📅 2025-05-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates whether pretrained language models (PLMs) can transcend token-level reasoning to perform abstract, higher-order semantic reasoning at the sentence or proposition level. To this end, the authors propose a sentence-level continuous autoregressive modeling framework. Methodologically, it introduces a dual-path embedding paradigm (semantic + contextual), designs hybrid discrete/continuous inference modes, and develops SentenceLens, a diagnostic tool for interpretability. The approach integrates PLM embedding-space adaptation, autoencoder-based semantic compression, next-sentence prediction for contextual modeling, and bidirectional embedding-text decoding. Evaluated on four reasoning-intensive domains (mathematics, logic, commonsense, and planning), the method matches chain-of-thought (CoT) performance while reducing average inference-time FLOPs by 50%. It further demonstrates strong scalability and modular adaptability. This work points toward a new paradigm for advancing PLMs toward semantic-level reasoning.

📝 Abstract
Autoregressive language models (LMs) generate one token at a time, yet human reasoning operates over higher-level abstractions: sentences, propositions, and concepts. This contrast raises a central question: can LMs likewise learn to reason over structured semantic units rather than raw token sequences? In this work, we investigate whether pretrained LMs can be lifted into such abstract reasoning spaces by building on their learned representations. We present a framework that adapts a pretrained token-level LM to operate in sentence space by autoregressively predicting continuous embeddings of next sentences. We explore two embedding paradigms inspired by classical representation learning: 1) semantic embeddings, learned via autoencoding to preserve surface meaning; and 2) contextual embeddings, trained via next-sentence prediction to encode anticipatory structure. We evaluate both under two inference regimes: Discretized, which decodes each predicted embedding into text before re-encoding; and Continuous, which reasons entirely in embedding space for improved efficiency. Across four domains (mathematics, logic, commonsense, and planning), contextual embeddings under continuous inference show competitive performance with Chain-of-Thought (CoT) while reducing inference-time FLOPs on average by half. We also present early signs of scalability and modular adaptation. Finally, to visualize latent trajectories, we introduce SentenceLens, a diagnostic tool that decodes intermediate model states into interpretable sentences. Together, our results indicate that pretrained LMs can effectively transition to abstract, structured reasoning within latent embedding spaces.
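To make the two inference regimes concrete, here is a minimal toy sketch of the control flow described in the abstract. All components are hypothetical stand-ins, not the paper's implementation: `encode`, `decode`, and `predict_next` are toy functions (the real system uses a learned sentence encoder, an embedding-to-text decoder, and an adapted pretrained LM), and the embedding dimension is arbitrary. The point is only the structural difference: the Continuous regime chains predicted embeddings directly, while the Discretized regime inserts a decode/re-encode round-trip at every step.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # toy embedding dimension (the real model would use the LM's hidden size)

# Toy stand-in for the adapted LM's next-sentence-embedding predictor.
W_pred = rng.standard_normal((D, D)) / np.sqrt(D)

def encode(sentence):
    # Stand-in sentence encoder: identity on an already-continuous vector.
    return sentence

def decode(embedding):
    # Stand-in decoder: the paper maps an embedding back to text here;
    # this toy version returns the vector unchanged so the loop runs.
    return embedding

def predict_next(history):
    # One autoregressive step in sentence space: predict the next
    # sentence embedding from the most recent one.
    return np.tanh(history[-1] @ W_pred)

def continuous_rollout(first_sentence, steps):
    """Continuous regime: reason entirely in embedding space."""
    history = [encode(first_sentence)]
    for _ in range(steps):
        history.append(predict_next(history))
    return history

def discretized_rollout(first_sentence, steps):
    """Discretized regime: decode each prediction, then re-encode it."""
    history = [encode(first_sentence)]
    for _ in range(steps):
        text = decode(predict_next(history))  # extra decode/encode round-trip
        history.append(encode(text))
    return history

start = rng.standard_normal(D)
cont = continuous_rollout(start, steps=3)
disc = discretized_rollout(start, steps=3)
print(len(cont), len(disc))  # each trajectory: 1 start + 3 predicted sentences
```

With identity stand-ins the two regimes trace the same trajectory; in the real system the decode/re-encode round-trip costs extra FLOPs per step, which is the efficiency gap the paper's "half the FLOPs" claim refers to.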
Problem

Research questions and friction points this paper is trying to address.

Can autoregressive LMs reason over structured semantic units instead of raw tokens?
How can a token-level LM be adapted to predict continuous sentence embeddings?
Do contextual embeddings improve efficiency and performance in abstract reasoning?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Autoregressive prediction of continuous sentence embeddings
Semantic and contextual embedding paradigms
Continuous inference for efficiency and performance