🤖 AI Summary
This paper introduces the notion of a subsequence cover (s-cover): a word $C$ is an s-cover of a word $S$ if the occurrences of $C$ as a subsequence of $S$ together cover every position of $S$.
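Method: We define s-primitive words (words without a proper s-cover) and give explicit lower and upper bounds on the length of a longest s-primitive word, both exponential in the alphabet size; the upper bound improves the one from the conference version [SPIRE 2022]. We design a linear-time algorithm to decide whether a given candidate word is an s-cover of $S$ over a polynomially bounded integer alphabet, and an algorithm to compute a shortest s-cover that runs in linear time for constant-size alphabets.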
Contribution/Results: s-covers appear to be computationally much harder than classical string covers (based on factors) but much easier than the related shuffle powers. The results extend classical string covering theory to subsequences, combining algorithms for testing and finding s-covers with length bounds for s-primitive words.
📝 Abstract
We introduce subsequence covers (s-covers, in short), a new type of covers of a word. A word $C$ is an s-cover of a word $S$ if the occurrences of $C$ in $S$ as subsequences cover all the positions in $S$.
The s-covers seem to be computationally much harder than standard covers of words (cf. Apostolico et al., Inf. Process. Lett. 1991), but, on the other hand, much easier than the related shuffle powers (Warmuth and Haussler, J. Comput. Syst. Sci. 1984).
We give a linear-time algorithm for testing if a candidate word $C$ is an s-cover of a word $S$ over a polynomially bounded integer alphabet. We also give an algorithm for finding a shortest s-cover of a word $S$ which, in the case of a constant-sized alphabet, also runs in linear time.
Words without a proper s-cover are called s-primitive. We complement our algorithmic results with explicit lower and upper bounds on the length of a longest s-primitive word. Both bounds are exponential in the size of the alphabet. The upper bound presented here improves the bound given in the conference version of this paper [SPIRE 2022].
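To make the definition of an s-cover concrete, the following is a minimal Python sketch, not the paper's algorithm. The function names `is_s_cover` and `shortest_s_cover_bruteforce` are hypothetical. The test runs in $O(|S| \cdot |C|)$ time rather than the linear time stated in the abstract, and the search for a shortest s-cover is an exponential brute force meant only for tiny inputs. The test relies on one observation: position $i$ of $S$ can play the role of the $j$-th letter of $C$ exactly when the first $j$ letters of $C$ embed before $i$ and the remaining letters embed after $i$.

```python
from itertools import combinations


def is_s_cover(c: str, s: str) -> bool:
    """Return True if subsequence occurrences of c cover every position of s.

    Illustrative O(len(s) * len(c)) check; not the linear-time algorithm.
    """
    n, m = len(s), len(c)
    if m == 0:
        return n == 0

    # pref[i]: length of the longest prefix of c that is a subsequence of s[:i]
    pref = [0] * (n + 1)
    for i in range(n):
        k = pref[i]
        pref[i + 1] = k + 1 if k < m and s[i] == c[k] else k

    # suf[i]: length of the longest suffix of c that is a subsequence of s[i:]
    suf = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        k = suf[i + 1]
        suf[i] = k + 1 if k < m and s[i] == c[m - 1 - k] else k

    # Position i is covered if it can serve as some c[j] in an occurrence.
    for i in range(n):
        covered = any(
            c[j] == s[i] and pref[i] >= j and suf[i + 1] >= m - 1 - j
            for j in range(m)
        )
        if not covered:
            return False
    return True


def shortest_s_cover_bruteforce(s: str) -> str:
    """Try all subsequences of s by increasing length (exponential time)."""
    n = len(s)
    for length in range(1, n + 1):
        candidates = sorted(
            {"".join(s[i] for i in idx) for idx in combinations(range(n), length)}
        )
        for c in candidates:
            if is_s_cover(c, s):
                return c
    return s
```

Under this sketch, a word `s` is s-primitive exactly when `shortest_s_cover_bruteforce(s)` has the same length as `s`. For example, `ab` is an s-cover of `aabb`, while `aba` has no proper s-cover.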