Text Indexing and Pattern Matching with Ephemeral Edits

📅 2025-08-07
🤖 AI Summary
This paper studies text indexing and pattern matching under ephemeral edits: substring insertions, deletions, and substitutions that are each reverted before the next edit is applied. It gives two results. For text indexing, a text *T* of length *n* is preprocessed in *O*(*n*) time; any pattern *P* of length *m*, given online, is then preprocessed in *O*(*m* log log *m*) time and *O*(*m*) space, and after each ephemeral edit (with insertions and substitutions restricted to constant length) all *Occ* occurrences of *P* in the edited text are reported in *O*(log log *n* + *Occ*) time. For pattern matching, where *T* and *P* are both known in advance and have length at most *n*, *O*(*n*)-time preprocessing allows all occurrences to be reported after each ephemeral edit in the optimal *O*(*Occ*) time. Along the way, the paper also gives an optimal solution for pattern matching with ephemeral block deletions. The motivating settings are streams of independent edits and the testing of hypothetical edits.

📝 Abstract
A sequence $e_0,e_1,\ldots$ of edit operations in a string $T$ is called ephemeral if operation $e_i$ constructing string $T^i$, for all $i=2k$ with $k\in\mathbb{N}$, is reverted by operation $e_{i+1}$ that reconstructs $T$. Such a sequence arises when processing a stream of independent edits or testing hypothetical edits. We introduce text indexing with ephemeral substring edits, a new version of text indexing. Our goal is to design a data structure over a given text that supports subsequent pattern matching queries with ephemeral substring insertions, deletions, or substitutions in the text; we require insertions and substitutions to be of constant length. In particular, we preprocess a text $T=T[0\mathinner{.\,.} n)$ over an integer alphabet $\Sigma=[0,\sigma)$ with $\sigma=n^{\mathcal{O}(1)}$ in $\mathcal{O}(n)$ time. Then, we can preprocess any arbitrary pattern $P=P[0\mathinner{.\,.} m)$ given online in $\mathcal{O}(m\log\log m)$ time and $\mathcal{O}(m)$ space and allow any ephemeral sequence of edit operations in $T$. Before reverting the $i$th operation, we report all Occ occurrences of $P$ in $T^i$ in $\mathcal{O}(\log\log n + \text{Occ})$ time. We also introduce pattern matching with ephemeral edits. In particular, we preprocess two strings $T$ and $P$, each of length at most $n$, over an integer alphabet $\Sigma=[0,\sigma)$ with $\sigma=n^{\mathcal{O}(1)}$ in $\mathcal{O}(n)$ time. Then, we allow any ephemeral sequence of edit operations in $T$. Before reverting the $i$th operation, we report all Occ occurrences of $P$ in $T^i$ in the optimal $\mathcal{O}(\text{Occ})$ time. Along our way to this result, we also give an optimal solution for pattern matching with ephemeral block deletions.
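To make the query model concrete, here is a naive reference sketch in Python of what an ephemeral sequence asks for: apply one edit to $T$, report the occurrences of $P$ in the edited string, then revert. It is only a specification-style baseline that rescans the whole text per edit, not the paper's data structure. The `(start, end, replacement)` edit encoding and all function names are assumptions made for this illustration.

```python
from typing import List, Tuple

# Hypothetical edit encoding (not from the paper): replace text[start:end]
# with `replacement`. An insertion has start == end; a deletion has an
# empty replacement; a substitution has both parts non-empty.
Edit = Tuple[int, int, str]


def apply_edit(text: str, edit: Edit) -> str:
    """Return the edited string T^i; the original text is left untouched."""
    start, end, replacement = edit
    return text[:start] + replacement + text[end:]


def occurrences(text: str, pattern: str) -> List[int]:
    """All starting positions of a non-empty pattern, overlaps included."""
    positions = []
    pos = text.find(pattern)
    while pos != -1:
        positions.append(pos)
        pos = text.find(pattern, pos + 1)
    return positions


def ephemeral_queries(text: str, pattern: str, edits: List[Edit]) -> List[List[int]]:
    """Answer one pattern matching query per ephemeral edit."""
    results = []
    for edit in edits:
        edited = apply_edit(text, edit)        # e_{2k}: construct T^i
        results.append(occurrences(edited, pattern))
        # e_{2k+1}: revert. Python strings are immutable, so `text` is
        # already the original T and nothing needs to be undone here.
    return results
```

For example, with text `"abracadabra"` and pattern `"abra"`, deleting `text[4:7]` yields `"abraabra"`, so that query reports positions 0 and 4. Each query here costs time linear in the text length, which is exactly the per-edit cost the bounds in the abstract avoid.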
Problem

Research questions and friction points this paper is trying to address.

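Text indexing that supports pattern matching queries under ephemeral substring insertions, deletions, and substitutions
Pattern matching between a preprocessed text and pattern under an ephemeral sequence of edit operations
Output-sensitive reporting of all pattern occurrences in the edited text before each edit is reverted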
Innovation

Methods, ideas, or system contributions that make the work stand out.

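Text index built in $\mathcal{O}(n)$ time that answers queries under ephemeral substring edits in $\mathcal{O}(\log\log n + \text{Occ})$ time
Online pattern preprocessing in $\mathcal{O}(m\log\log m)$ time and $\mathcal{O}(m)$ space
Optimal $\mathcal{O}(\text{Occ})$-time reporting for pattern matching with ephemeral edits, and an optimal solution for ephemeral block deletions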
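One way to see why per-edit reporting need not rescan the whole text is the following Python sketch. It relies on a standard observation, not on the paper's construction: every occurrence in the edited string lies entirely before the edit, entirely after it (shifted), or overlaps a window of fewer than $m$ characters on each side of the edited region. The function names and the `(start, end, replacement)` parameters are hypothetical, and the window rescan still costs time that depends on $m$, so this does not achieve the bounds stated above.

```python
from bisect import bisect_left, bisect_right
from typing import List


def find_all(text: str, pattern: str) -> List[int]:
    """Sorted starting positions of a non-empty pattern, overlaps included."""
    positions = []
    pos = text.find(pattern)
    while pos != -1:
        positions.append(pos)
        pos = text.find(pattern, pos + 1)
    return positions


def occurrences_after_edit(text: str, pattern: str, base_occ: List[int],
                           start: int, end: int, replacement: str) -> List[int]:
    """Occurrences of pattern in text[:start] + replacement + text[end:].

    base_occ must be find_all(text, pattern), computed once up front.
    """
    m = len(pattern)
    shift = len(replacement) - (end - start)

    # Occurrences entirely before the edit: positions p with p + m <= start.
    left = base_occ[:bisect_right(base_occ, start - m)]

    # Occurrences entirely after the edit: positions p >= end, shifted.
    right = [p + shift for p in base_occ[bisect_left(base_occ, end):]]

    # Occurrences overlapping the edited region: rescan a small window made
    # of up to m - 1 characters before the edit, the replacement itself, and
    # up to m - 1 characters after the edit.
    lo = max(0, start - m + 1)
    window = text[lo:start] + replacement + text[end:end + m - 1]
    middle = [lo + w for w in find_all(window, pattern)]

    return left + middle + right
```

The original text and `base_occ` are never modified, so reverting an edit costs nothing in this sketch; only the small window is rebuilt per query.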