🤖 AI Summary
This paper addresses the inefficiency of prefix probability computation for strings under probabilistic context-free grammars (PCFGs). We propose an accelerated variant of the Jelinek–Lafferty algorithm by reformulating the dynamic programming state transitions, incorporating matrix-based optimizations, and introducing a recursive prefix probability mechanism. The approach preserves exactness while reducing the time complexity from $O(n^3 |N|^3 + |N|^4)$ to $O(n^2 |N|^3 + n^3 |N|^2)$, where $n$ is the input string length and $|N|$ is the number of nonterminals. Crucially, this eliminates the quartic dependency on $|N|$: the grammar-size dependence is now at most cubic. The improvement makes the method particularly suitable for large PCFGs, enabling efficient prefix probability computation in practical natural language processing applications such as incremental parsing and language modeling.
📝 Abstract
Multiple algorithms are known for efficiently calculating the prefix probability of a string under a probabilistic context-free grammar (PCFG). Good algorithms for the problem have a runtime cubic in the length of the input string. However, some proposed algorithms are suboptimal with respect to the size of the grammar. This paper proposes a new speed-up of Jelinek and Lafferty’s (1991) algorithm, which runs in $O(n^3 |N|^3 + |N|^4)$, where $n$ is the input length and $|N|$ is the number of non-terminals in the grammar. In contrast, our speed-up runs in $O(n^2 |N|^3 + n^3 |N|^2)$.
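To give a rough sense of the gap between the two bounds, the sketch below evaluates their dominant terms for a sample input length and grammar size. The function names are hypothetical and not taken from the paper, and the constant factors hidden by the O-notation are ignored, so the resulting ratio is only indicative and is not a measured speed-up.

```python
def original_bound(n: int, num_nonterminals: int) -> int:
    """Dominant terms of O(n^3 |N|^3 + |N|^4), constants ignored (assumption)."""
    return n**3 * num_nonterminals**3 + num_nonterminals**4


def sped_up_bound(n: int, num_nonterminals: int) -> int:
    """Dominant terms of O(n^2 |N|^3 + n^3 |N|^2), constants ignored (assumption)."""
    return n**2 * num_nonterminals**3 + n**3 * num_nonterminals**2


def bound_ratio(n: int, num_nonterminals: int) -> float:
    """Ratio of the two term counts; an indication only, not a runtime measurement."""
    return original_bound(n, num_nonterminals) / sped_up_bound(n, num_nonterminals)


if __name__ == "__main__":
    # Illustrative values: a 20-token prefix and a grammar with 100 non-terminals.
    n, size = 20, 100
    print(original_bound(n, size), sped_up_bound(n, size), bound_ratio(n, size))
```

Under these simplifications, the ratio is largest when $n$ and $|N|$ are both large, since the sped-up bound replaces the product $n^3 |N|^3$ with two terms that are each a factor of $n$ or $|N|$ smaller.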