Deterministic Suffix-reading Automata

📅 2025-05-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper studies the construction of minimal deterministic suffix-reading automata (DSAs) for regular languages. In a DSA, transition labels are words of arbitrary length, and a transition fires once the input read from the current state ends with its label. Size is measured as total-size: the number of states plus the number of transitions plus the total length of all labels. The paper formally defines the DSA model, gives a DSA-to-DFA conversion for comparing expressiveness and succinctness with related models, and presents a method for deriving DSAs from a given DFA. It shows that the smallest DSA derived from the canonical (minimal) DFA of a language need not be a minimal DSA for that language, which is a fundamental obstacle to minimization. It also proves that deciding, given a DFA and an integer k, whether an equivalent DSA of total-size at most k exists is NP-complete.

📝 Abstract

We introduce deterministic suffix-reading automata (DSA), a new automaton model over finite words. Transitions in a DSA are labeled with words. From a state, a DSA triggers an outgoing transition on seeing a word ending with the transition's label. Therefore, rather than moving along an input word letter by letter, a DSA can jump along blocks of letters, with each block ending in a suitable suffix. This feature allows DSAs to recognize regular languages more concisely than DFAs. In this work, we focus on questions around finding a minimal DSA for a regular language. The number of states is not a faithful measure of the size of a DSA, since the transition labels contain strings of arbitrary length. Hence, we consider total-size (number of states + number of edges + total length of transition labels) as the size measure of DSAs. We start by formally defining the model and providing a DSA-to-DFA conversion that allows us to compare the expressiveness and succinctness of DSAs with related automata models. Our main technical contribution is a method to derive DSAs from a given DFA: a DFA-to-DSA conversion. We make the surprising observation that the smallest DSA derived from the canonical DFA of a regular language L need not be a minimal DSA for L. This observation leads to a fundamental bottleneck in deriving a minimal DSA for a regular language. In fact, we prove that, given a DFA and a number k, the problem of deciding whether there exists an equivalent DSA of total-size at most k is NP-complete.
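The block-reading behaviour and the total-size measure described in the abstract can be illustrated with a small sketch. This is one reading of the abstract, not the paper's formal definition: the names `DSA`, `accepts`, `total_size`, and `example` are hypothetical, and the sketch assumes that a word is accepted only if it splits exactly into blocks that each fire a transition, and that no outgoing label of a state is a suffix of another outgoing label of the same state (so at most one transition can fire at any point).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DSA:
    """Hypothetical container for a DSA; transitions are (source, label, target)."""
    states: frozenset
    initial: str
    accepting: frozenset
    transitions: tuple


def total_size(dsa: DSA) -> int:
    """Total-size: number of states + number of edges + total label length."""
    label_length = sum(len(label) for _, label, _ in dsa.transitions)
    return len(dsa.states) + len(dsa.transitions) + label_length


def accepts(dsa: DSA, word: str) -> bool:
    """Simulate the DSA on `word`, jumping block by block.

    Assumption: from the current state, the automaton keeps reading letters
    until the block read so far ends with the label of some outgoing
    transition, then takes that transition and starts a new block.
    """
    state = dsa.initial
    block = ""  # letters read since the last transition fired
    for letter in word:
        block += letter
        for source, label, target in dsa.transitions:
            if source == state and block.endswith(label):
                state, block = target, ""
                break
    # Assumption: leftover letters that fired no transition mean rejection.
    return block == "" and state in dsa.accepting


# Example: a single transition labeled "ab" into an accepting sink state.
# Under the assumptions above, it accepts exactly the words whose first
# occurrence of "ab" is at the very end.
example = DSA(
    states=frozenset({"q0", "q1"}),
    initial="q0",
    accepting=frozenset({"q1"}),
    transitions=(("q0", "ab", "q1"),),
)

print(accepts(example, "bbaab"))  # the whole word is one block ending in "ab"
print(accepts(example, "abab"))   # "ab" fires first, the trailing "ab" is left over
print(total_size(example))        # 2 states + 1 edge + label length 2
```

Here the single word-labeled edge stands in for the letter-by-letter bookkeeping a DFA would need to track progress toward "ab", which is the kind of conciseness the abstract refers to.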

Problem

Research questions and friction points this paper is trying to address.

Introducing deterministic suffix-reading automata (DSA) for recognizing regular languages more concisely than DFAs

Investigating minimal DSA construction for regular languages using total-size as the size measure

Proving that deciding whether a given DFA has an equivalent DSA of total-size at most k is NP-complete

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces deterministic suffix-reading automata (DSA)

Uses word-labeled transitions for concise recognition

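Proves NP-completeness of the decision version of DSA minimization (given a DFA and a number k, is there an equivalent DSA of total-size at most k?)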

🔎 Similar Papers

Constructing a BPE Tokenization DFA