🤖 AI Summary
This paper investigates the construction of minimal deterministic suffix-reading automata (DSAs) for regular languages. In a DSA, transition labels are words of arbitrary length, and a transition fires once the input read so far ends with its label. The size of a DSA is defined as the sum of the number of states, the number of transitions, and the total length of all labels. The authors formally define the DSA model, give a DSA-to-DFA conversion, and propose a method for deriving DSAs from a given DFA. They show that the smallest DSA derived from the canonical (minimal) DFA of a language is not necessarily a minimal DSA for that language, which exposes an inherent limitation of this construction. They further prove that deciding, given a DFA and an integer k, whether an equivalent DSA of total size at most k exists is NP-complete. Together, these results fix the semantics and size measure of DSAs, provide conversions in both directions between DSAs and DFAs, and establish the computational complexity of DSA minimization.
📝 Abstract
We introduce deterministic suffix-reading automata (DSA), a new automaton model over finite words. Transitions in a DSA are labeled with words. From a state, a DSA triggers an outgoing transition on seeing a word ending with the transition's label. Therefore, rather than moving along an input word letter by letter, a DSA can jump along blocks of letters, with each block ending in a suitable suffix. This feature allows DSAs to recognize regular languages more concisely than DFAs. In this work, we focus on questions around finding a minimal DSA for a regular language. The number of states is not a faithful measure of the size of a DSA, since the transition labels contain strings of arbitrary length. Hence, we consider total size (number of states + number of edges + total length of transition labels) as the size measure of DSAs.
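To make the block-jumping behaviour and the total-size measure concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's formal definition: the abstract does not say what happens when several labels match at once or when trailing letters trigger no transition, so the sketch assumes that the longest matching label wins and that a word is accepted only if its last block ends exactly at the end of the word in an accepting state. The names `DSA`, `accepts`, `total_size`, and `example` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DSA:
    initial: str
    accepting: set
    # transitions[state] is a list of (label, target) pairs; labels are non-empty strings.
    transitions: dict


def accepts(dsa, word):
    """Simulate the DSA on `word`, consuming it block by block."""
    state = dsa.initial
    block_start = 0  # index where the current block begins
    for i in range(1, len(word) + 1):
        block = word[block_start:i]
        matches = [
            (label, target)
            for label, target in dsa.transitions.get(state, [])
            if block.endswith(label)
        ]
        if matches:
            # Assumption: if several labels are suffixes of the block, the longest wins.
            _, state = max(matches, key=lambda m: len(m[0]))
            block_start = i
    # Assumption: leftover letters that trigger no transition cause rejection.
    return block_start == len(word) and state in dsa.accepting


def total_size(dsa):
    """Number of states + number of edges + total length of transition labels."""
    states = {dsa.initial} | set(dsa.accepting) | set(dsa.transitions)
    edges = 0
    label_length = 0
    for outgoing in dsa.transitions.values():
        for label, target in outgoing:
            states.add(target)
            edges += 1
            label_length += len(label)
    return len(states) + edges + label_length


# Hypothetical example: from q0, jump to the accepting state q1 on the first
# block ending in "ab"; q1 has no outgoing transitions.
example = DSA(initial="q0", accepting={"q1"}, transitions={"q0": [("ab", "q1")]})
```

With this example, `accepts(example, "bbab")` holds because the single block `bbab` ends in `ab`, while `accepts(example, "abab")` fails because the letters after the first `ab` trigger nothing from `q1`. The total size is 2 states + 1 edge + 2 label letters = 5.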
We start by formally defining the model and providing a DSA-to-DFA conversion that allows us to compare the expressiveness and succinctness of DSA with related automata models. Our main technical contribution is a method to derive DSAs from a given DFA: a DFA-to-DSA conversion. We make the surprising observation that the smallest DSA derived from the canonical DFA of a regular language L need not be a minimal DSA for L. This observation points to a fundamental bottleneck in deriving a minimal DSA for a regular language. In fact, we prove that given a DFA and a number k, the problem of deciding if there exists an equivalent DSA of total size at most k is NP-complete.