🤖 AI Summary
This work addresses dynamic pattern matching on highly repetitive strings, supporting efficient insertions and deletions.
Method: We propose the first dynamic variant of the r-index, built upon the run-length encoded Burrows–Wheeler Transform (RLBWT) and a dynamic LCP array, integrating balanced binary search trees and differential encoding for efficient index maintenance.
Contribution/Results: We introduce the first fully dynamic r-index, enabling edit operations in $O((m + L_{\max}) \log n)$ time while preserving $O(r)$ space usage and supporting locate queries in $O((m + \text{occ}) \log n)$ time. Experiments on diverse highly repetitive datasets support the index's practicality, with space proportional to the number $r$ of RLBWT runs, giving a favorable trade-off among compression, dynamic update support, and query efficiency.
📝 Abstract
A self-index is a compressed data structure that supports locate queries, that is, reporting all positions where a given pattern occurs in a string. While many self-indexes have been proposed, developing dynamically updatable ones supporting string insertions and deletions remains a challenge. The r-index (Gagie et al., SODA'18) is a representative static self-index based on the run-length Burrows-Wheeler transform (RLBWT), designed for highly repetitive strings, i.e., those with many repeated substrings. We present the dynamic r-index, an extension of the r-index that supports locate queries in $\mathcal{O}((m + occ) \log n)$ time using $\mathcal{O}(r)$ words, where $n$ is the length of the string $T$, $m$ is the pattern length, $occ$ is the number of occurrences, and $r$ is the number of runs in the RLBWT of $T$. It supports string insertions and deletions in $\mathcal{O}((m + L_{\max}) \log n)$ time, where $L_{\max}$ is the maximum value in the LCP array of $T$. The average running time is $\mathcal{O}((m + L_{avg}) \log n)$, where $L_{avg}$ is the average LCP value. We experimentally evaluated the dynamic r-index on various highly repetitive strings and demonstrated its practicality.
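To make the parameters in these bounds concrete, the following is a minimal Python sketch that computes $r$, $L_{\max}$, $L_{avg}$, and $occ$ for a small repetitive string. It is a naive illustration, not the dynamic r-index itself: the function names, the example string, and the `$` end marker are assumptions made here, and the quadratic-time constructions are chosen only for readability.

```python
def suffix_array(text):
    # Naive construction by sorting suffixes; for illustration only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_from_sa(text, sa):
    # BWT[i] is the character preceding the i-th smallest suffix (cyclically).
    return "".join(text[i - 1] for i in sa)


def count_runs(s):
    # Number of maximal runs of equal characters, i.e. r when s is the BWT.
    return sum(1 for i, c in enumerate(s) if i == 0 or c != s[i - 1])


def lcp_array(text, sa):
    # LCP[i] = length of the longest common prefix of the suffixes
    # ranked i-1 and i in sorted order; LCP[0] is defined as 0.
    def lcp(a, b):
        k = 0
        while a + k < len(text) and b + k < len(text) and text[a + k] == text[b + k]:
            k += 1
        return k

    return [0] + [lcp(sa[i - 1], sa[i]) for i in range(1, len(sa))]


def locate(text, pattern):
    # Brute-force locate query: all starting positions of pattern in text.
    return [i for i in range(len(text) - len(pattern) + 1)
            if text.startswith(pattern, i)]


# Hypothetical example string; '$' is an assumed end marker smaller than 'a'.
T = "abababab$"
sa = suffix_array(T)
bwt = bwt_from_sa(T, sa)
lcp = lcp_array(T, sa)

r = count_runs(bwt)            # 3 runs in "bbbb$aaaa"
l_max = max(lcp)               # 6
l_avg = sum(lcp) / len(lcp)    # 21 / 9
occ = len(locate(T, "ab"))     # 4 occurrences, at positions 0, 2, 4, 6

print(bwt, r, l_max, round(l_avg, 3), occ)
```

On this input the BWT has only $r = 3$ runs for $n = 9$ characters, which is the kind of gap between $r$ and $n$ that the $\mathcal{O}(r)$-word space bound exploits. The same string has $L_{\max} = 6$ against $L_{avg} \approx 2.33$, showing how the average-case update bound can be smaller than the worst-case one.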