A Fast Algorithm for Computing Prefix Probabilities

📅 2023-06-04

🏛️ Annual Meeting of the Association for Computational Linguistics

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

This paper addresses the cost of computing prefix probabilities of strings under probabilistic context-free grammars (PCFGs). It proposes an accelerated variant of the Jelinek–Lafferty algorithm by reformulating the dynamic programming state transitions, incorporating matrix-based optimizations, and introducing a recursive prefix probability mechanism. The approach preserves exactness while reducing the time complexity from $O(n^3 |N|^3 + |N|^4)$ to $O(n^2 |N|^3 + n^3 |N|^2)$, where $n$ is the input string length and $|N|$ the number of nonterminals. The $|N|^4$ term disappears entirely, and the term cubic in $n$ now scales with $|N|^2$ rather than $|N|^3$. The improvement makes the method better suited to PCFGs with many nonterminals, in natural language processing applications such as incremental parsing and language modeling.
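
As a rough illustration of what the two bounds imply, the expressions inside the $O(\cdot)$ can be compared directly. This is only a comparison of the bound expressions with hidden constants ignored, not a statement about measured runtimes; the names $T_{\text{old}}$ and $T_{\text{new}}$ are introduced here for convenience and do not come from the paper.

$$
T_{\text{old}} = n^3 |N|^3 + |N|^4 = |N|^3 \left(n^3 + |N|\right), \qquad
T_{\text{new}} = n^2 |N|^3 + n^3 |N|^2 = n^2 |N|^2 \left(n + |N|\right)
$$

$$
\frac{T_{\text{old}}}{T_{\text{new}}} = \frac{|N| \left(n^3 + |N|\right)}{n^2 \left(n + |N|\right)} \;\ge\;
\begin{cases}
|N| / 2 & \text{if } n \ge |N| \quad (\text{since } n + |N| \le 2n), \\
n / 2 & \text{if } |N| \ge n \quad (\text{since } n + |N| \le 2|N|).
\end{cases}
$$

Under these assumptions the old expression exceeds the new one by a factor of at least $\min(n, |N|)/2$.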

📝 Abstract

Multiple algorithms are known for efficiently calculating the prefix probability of a string under a probabilistic context-free grammar (PCFG). Good algorithms for the problem have a runtime cubic in the length of the input string. However, some proposed algorithms are suboptimal with respect to the size of the grammar. This paper proposes a new speed-up of Jelinek and Lafferty’s (1991) algorithm, which runs in $O(n^3 |N|^3 + |N|^4)$, where $n$ is the input length and $|N|$ is the number of non-terminals in the grammar. In contrast, our speed-up runs in $O(n^2 |N|^3 + n^3 |N|^2)$.
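
To get a feel for the two bounds stated in the abstract, the following minimal sketch evaluates both expressions for a few input lengths and grammar sizes. It is not an implementation of either algorithm: the function names are hypothetical, the sample sizes are arbitrary, and constant factors hidden by the $O(\cdot)$ notation are ignored, so the ratios say nothing definite about wall-clock time.

```python
def jelinek_lafferty_bound(n: int, num_nonterminals: int) -> int:
    """Expression inside O(n^3 |N|^3 + |N|^4); constants are ignored."""
    return n**3 * num_nonterminals**3 + num_nonterminals**4


def speedup_bound(n: int, num_nonterminals: int) -> int:
    """Expression inside O(n^2 |N|^3 + n^3 |N|^2); constants are ignored."""
    return n**2 * num_nonterminals**3 + n**3 * num_nonterminals**2


def bound_ratio(n: int, num_nonterminals: int) -> float:
    """How many times larger the older bound expression is than the newer one."""
    return jelinek_lafferty_bound(n, num_nonterminals) / speedup_bound(
        n, num_nonterminals
    )


if __name__ == "__main__":
    # Arbitrary illustrative sizes: (input length, number of nonterminals).
    for n, size in [(10, 100), (40, 100), (40, 1000)]:
        print(f"n={n:>3} |N|={size:>5} ratio={bound_ratio(n, size):.1f}")
```

The ratio grows with both the input length and the number of nonterminals, which is consistent with the abstract's point that the older algorithm is suboptimal with respect to grammar size.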

Problem

Research questions and friction points this paper is trying to address.

Efficiently compute prefix probabilities of strings under a PCFG

Reduce runtime complexity of existing algorithms

Optimize performance relative to grammar size

Innovation

Methods, ideas, or system contributions that make the work stand out.

Speed-up of Jelinek and Lafferty's algorithm

Reduces the runtime bound from $O(n^3 |N|^3 + |N|^4)$ to $O(n^2 |N|^3 + n^3 |N|^2)$

Stays cubic in input length while lowering the dependence on grammar size
