On the Ability of Transformers to Verify Plans

📅 2026-03-20
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the lack of theoretical guarantees for length generalization in Transformers on plan verification tasks where sequence length and vocabulary size grow simultaneously. We propose the C*-RASP theoretical framework, which provides the first provable length generalization guarantee for decoder-only Transformers under this setting. By leveraging structural properties from classical AI planning domains, we characterize a class of planning domains that Transformers can reliably verify and identify key structural features governing their generalization capability. Our theoretical findings are empirically validated, demonstrating a strong correlation between the identified structural attributes and model performance.

📝 Abstract
Transformers have shown inconsistent success in AI planning tasks, and theoretical understanding of when generalization should be expected has been limited. We take important steps towards addressing this gap by analyzing the ability of decoder-only models to verify whether a given plan correctly solves a given planning instance. To analyze the general setting where the number of objects, and thus the effective input alphabet, grows at test time, we introduce C*-RASP, an extension of C-RASP designed to establish length generalization guarantees for transformers under simultaneous growth in sequence length and vocabulary size. Our results identify a large class of classical planning domains for which transformers can provably learn to verify long plans, and structural properties that significantly affect the learnability of length-generalizable solutions. Empirical experiments corroborate our theory.
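
To make the plan verification task concrete, here is a minimal sketch of a symbolic verifier. It assumes a STRIPS-style encoding, where states are sets of facts and each ground action has preconditions, add effects, and delete effects. The `Action` class, the `verify_plan` function, and the toy blocks instance are hypothetical names chosen for illustration; they are not taken from the paper, and this is not the C*-RASP construction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A ground STRIPS-style action (hypothetical representation)."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset


def verify_plan(initial_state, goal, plan):
    """Return True if applying `plan` from `initial_state` reaches `goal`."""
    state = set(initial_state)
    for action in plan:
        # Every precondition must hold in the current state.
        if not action.preconditions <= state:
            return False
        # Apply delete effects first, then add effects.
        state = (state - action.delete_effects) | action.add_effects
    # The plan is valid only if all goal facts hold at the end.
    return set(goal) <= state


# Toy instance (assumed for illustration): stack block a on block b.
pick_up_a = Action(
    name="pick-up a",
    preconditions=frozenset({"on-table a", "clear a", "hand-empty"}),
    add_effects=frozenset({"holding a"}),
    delete_effects=frozenset({"on-table a", "clear a", "hand-empty"}),
)
stack_a_b = Action(
    name="stack a b",
    preconditions=frozenset({"holding a", "clear b"}),
    add_effects=frozenset({"on a b", "clear a", "hand-empty"}),
    delete_effects=frozenset({"holding a", "clear b"}),
)

initial = {"on-table a", "on-table b", "clear a", "clear b", "hand-empty"}
goal = {"on a b"}

valid = verify_plan(initial, goal, [pick_up_a, stack_a_b])
invalid = verify_plan(initial, goal, [stack_a_b])  # precondition fails
```

In this encoding, adding more blocks adds more distinct fact and action symbols, which is one way to picture the setting in which the input alphabet grows together with plan length.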
Problem

Research questions and friction points this paper is trying to address.

Transformers
plan verification
length generalization
vocabulary growth
AI planning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformers
plan verification
length generalization
C*-RASP
AI planning