Learning DFAs from Positive Examples Only via Word Counting

📅 2025-11-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the classical problem of learning deterministic finite automata (DFA) from positive examples only—a setting motivated by black-box systems where negative examples are inaccessible. We propose a novel frequency-based approach: leveraging the count of accepted words up to a bounded length to infer the underlying DFA structure. For the first time, word-frequency statistics are incorporated into the computational complexity analysis of this problem, and we rigorously prove its NP-completeness. Building on this insight, we design a new algorithmic framework that integrates combinatorial optimization, automata theory, and minimal-accepting-word enumeration, achieving asymptotically superior time complexity compared to existing methods. Although its standalone accuracy currently lags behind state-of-the-art algorithms, when deployed as a preprocessing module, it significantly accelerates convergence and improves overall performance. Our work establishes a new paradigm for positive-example-driven modeling and verification of black-box systems.

📝 Abstract
Learning finite automata from positive examples has recently gained attention as a powerful approach for understanding, explaining, analyzing, and verifying black-box systems. The motivation for focusing solely on positive examples arises from the practical limitation that we can only observe what a system is capable of (positive examples) but not what it cannot do (negative examples). Unlike the classical problem of passive DFA learning with both positive and negative examples, which has been known to be NP-complete since the 1970s, the topic of learning DFAs exclusively from positive examples remains poorly understood. This paper introduces a novel perspective on this problem by leveraging the concept of counting the number of accepted words up to a carefully determined length. Our contributions are twofold. First, we prove that computing the minimal number of words up to this length accepted by DFAs of a given size that accept all positive examples is NP-complete, establishing that learning from positive examples alone is computationally demanding. Second, we propose a new learning algorithm with a better asymptotic runtime than the best-known bound for existing algorithms. While our experimental evaluation reveals that this algorithm underperforms state-of-the-art methods, it demonstrates significant potential as a preprocessing step to enhance existing approaches.
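The core quantity in the abstract, the number of words up to a given length accepted by a DFA, can be computed by a simple dynamic program over states. The sketch below is purely illustrative of that counting idea (it is not the paper's learning algorithm, and the DFA encoding is an assumption of this example): track how many words of each length reach each state, and sum the counts that land in accepting states.

```python
def count_accepted_words(delta, start, accepting, alphabet, max_len):
    """Count the words of length <= max_len accepted by a DFA.

    delta: dict mapping (state, symbol) -> state (a total transition function).
    Illustrative sketch of word counting only, not the paper's algorithm.
    """
    # counts[q] = number of words of the current length that drive the
    # DFA from `start` into state q
    counts = {start: 1}
    total = 1 if start in accepting else 0  # the empty word
    for _ in range(max_len):
        nxt = {}
        for q, c in counts.items():
            for a in alphabet:
                r = delta[(q, a)]
                nxt[r] = nxt.get(r, 0) + c
        counts = nxt
        total += sum(c for q, c in counts.items() if q in accepting)
    return total


# Example: a 2-state DFA over {0, 1} accepting words with an even number
# of 1s; the accepted words of length <= 2 are "", "0", "00", "11".
delta = {(0, '0'): 0, (0, '1'): 1, (1, '0'): 1, (1, '1'): 0}
print(count_accepted_words(delta, 0, {0}, ['0', '1'], 2))  # → 4
```

Each pass over `counts` costs time proportional to (states × alphabet size), so the whole count runs in O(max_len × |Q| × |Σ|) regardless of how many words the DFA accepts.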
Problem

Research questions and friction points this paper is trying to address.

Learning DFAs solely from positive examples via word counting
Proving NP-completeness of minimal word counting for DFA learning
Developing faster learning algorithms for positive-example-only DFA inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Learning DFAs from positive examples only
Using word counting up to specific length
Proposing new algorithm with better runtime