🤖 AI Summary
Problem: Monolithic DFA learning tends to produce large, structurally redundant models, while existing methods for DFA decomposition identification problems (DFA-DIPs) scale poorly because they rely on augmented prefix tree acceptors (APTAs) and standard SAT encodings.
Method: The paper replaces the APTA with a compact 3-valued DFA (3DFA) built directly from the labeled examples, which removes the APTA's redundancy and reduces the number of variables in an improved SAT encoding. It studies two variants of the problem: the traditional Pareto-optimal DIP and a new states-optimal DIP, which prioritizes a minimal number of states.
Results: Experiments show significant efficiency gains on Pareto-optimal DIP instances and a scalable solution for the states-optimal DIP. The approach thereby combines the modularity and interpretability of decomposed models with improved scalability.
📝 Abstract
The identification of deterministic finite automata (DFAs) from labeled examples is a cornerstone of automata learning, yet traditional methods learn a single monolithic DFA, which is often large and lacks simplicity and interpretability. Recent work addresses these limitations through DFA decomposition identification problems (DFA-DIPs), which model system behavior as the intersection of multiple DFAs and thus offer modularity for complex tasks. However, existing DFA-DIP approaches depend on SAT encodings derived from Augmented Prefix Tree Acceptors (APTAs), whose inherent redundancy limits scalability.
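To make the setting concrete, the following minimal Python sketch builds an APTA as a prefix tree over the labeled examples and checks whether a set of DFAs, read as an intersection, is consistent with those examples. The function names, the `(start, delta, accepting)` tuple layout, and the even-parity example are illustrative assumptions, not the paper's implementation.

```python
def build_apta(positive, negative):
    """Build a prefix tree over all examples.

    Returns (trans, label): trans[s][symbol] is the successor of state s,
    label[s] is True (accept), False (reject) or None (unlabeled prefix).
    """
    trans, label = [{}], [None]
    for words, lab in ((positive, True), (negative, False)):
        for word in words:
            s = 0
            for a in word:
                if a not in trans[s]:
                    trans.append({})
                    label.append(None)
                    trans[s][a] = len(trans) - 1
                s = trans[s][a]
            if label[s] is not None and label[s] != lab:
                raise ValueError(f"{word!r} is labeled both ways")
            label[s] = lab
    return trans, label


def dfa_accepts(dfa, word):
    """Run a complete DFA given as (start, delta, accepting)."""
    state, delta, accepting = dfa
    for a in word:
        state = delta[(state, a)]
    return state in accepting


def decomposition_accepts(dfas, word):
    """A decomposition accepts a word iff every component DFA accepts it."""
    return all(dfa_accepts(d, word) for d in dfas)


def is_consistent(dfas, positive, negative):
    """Check that the intersection accepts all positives and no negatives."""
    return (all(decomposition_accepts(dfas, w) for w in positive)
            and not any(decomposition_accepts(dfas, w) for w in negative))


# Hypothetical example: "even number of a's" AND "even number of b's".
EVEN_A = (0, {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}, {0})
EVEN_B = (0, {(0, "b"): 1, (1, "b"): 0, (0, "a"): 0, (1, "a"): 1}, {0})
POS = ["", "aa", "bb", "abab", "abba"]
NEG = ["a", "b", "ab", "aab", "abb"]
```

Here two 2-state components are consistent with the examples, whereas `EVEN_A` alone is not, because it accepts the negative example `"b"`. The APTA has one state per distinct prefix of the examples, which is the redundancy the abstract refers to.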
In this work, we advance DFA-DIP research by studying two variants: the traditional Pareto-optimal DIP and the novel states-optimal DIP, which prioritizes a minimal number of states. We propose a novel framework that bridges DFA decomposition with recent advances in automata representation. One of our key innovations replaces the APTA with a 3-valued DFA (3DFA) derived directly from the labeled examples. This compact representation eliminates the redundancies of the APTA, thus substantially reducing the number of variables in the improved SAT encoding. Experimental results demonstrate that our 3DFA-based approach achieves significant efficiency gains for the Pareto-optimal DIP while enabling a scalable solution for the states-optimal DIP.
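As a rough illustration of why a 3DFA can be smaller than an APTA, the sketch below builds the prefix tree, completes it with a "don't care" sink, and merges equivalent states by Moore-style partition refinement over three outputs (`"+"` accept, `"-"` reject, `"?"` don't care). This is one standard way to obtain a minimal 3-valued automaton and is an assumption here; the abstract does not specify the authors' construction, and all names are hypothetical.

```python
def build_trie(positive, negative):
    """Prefix tree with outputs '+', '-' or '?' (unlabeled prefix)."""
    trans, out = [{}], ["?"]
    for words, lab in ((positive, "+"), (negative, "-")):
        for word in words:
            s = 0
            for a in word:
                if a not in trans[s]:
                    trans.append({})
                    out.append("?")
                    trans[s][a] = len(trans) - 1
                s = trans[s][a]
            if out[s] not in ("?", lab):
                raise ValueError(f"{word!r} is labeled both ways")
            out[s] = lab
    return trans, out


def minimize_3dfa(trans, out, alphabet):
    """Return (start, delta, output) of the minimized 3-valued automaton."""
    sink = len(trans)  # extra state: every missing transition goes here
    delta = [{a: t.get(a, sink) for a in alphabet} for t in trans]
    delta.append({a: sink for a in alphabet})
    out = list(out) + ["?"]

    block = out[:]  # initial partition: group states by output
    n_blocks = len(set(block))
    while True:
        # A state's signature is its block plus the blocks of its successors.
        sig = [(block[s], tuple(block[delta[s][a]] for a in alphabet))
               for s in range(len(delta))]
        ids = {}
        block = [ids.setdefault(x, len(ids)) for x in sig]
        if len(ids) == n_blocks:  # no block was split: partition is stable
            break
        n_blocks = len(ids)

    q_delta, q_out = {}, {}
    for s, b in enumerate(block):
        q_out[b] = out[s]
        for a in alphabet:
            q_delta[(b, a)] = block[delta[s][a]]
    return block[0], q_delta, q_out


def classify(dfa3, word):
    """Return '+', '-' or '?' for a word over the given alphabet."""
    state, delta, out = dfa3
    for a in word:
        state = delta[(state, a)]
    return out[state]


# Hypothetical example: "a"/"b" and "ab"/"bb" behave identically and merge.
TRIE = build_trie(["ab", "bb"], ["a", "b"])
DFA3 = minimize_3dfa(*TRIE, alphabet="ab")
```

In this small example the prefix tree has 5 states, while the minimized automaton has 4 including the don't-care sink. Fewer states mean fewer state-indexed variables in a SAT encoding built over the automaton, which is the effect the abstract describes.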