🤖 AI Summary
This paper addresses active learning of symbolic Mealy automata that combine infinite input alphabets with multi-character outputs, two features that prior work handled only separately. Methodologically, it introduces the notion of "essential input characters": a finite set of input characters that is sufficient for learning the output function of a symbolic Mealy automaton. The proposed algorithm, $Λ^*_M$, maintains an underapproximation of this set and refines it during learning, thereby recovering the output associated with potentially infinite sets of input characters at each state. Theoretically, the algorithm is proven to terminate under certain assumptions, and upper and lower bounds on the query complexity are given; the closeness of the two bounds suggests that they are tight. Experimentally, the approach is efficient in the number of queries on practical benchmarks and scales well on randomly generated ones.
📝 Abstract
We propose $Λ^*_M$, an active learning algorithm for symbolic Mealy automata, which support infinite input alphabets and multiple output characters. Each of these two features has been addressed separately in prior work. Combining them poses a challenge: at each state, the learner must determine the outputs corresponding to potentially infinite sets of input characters. To address this challenge, we introduce the notion of essential input characters, a finite set of input characters that is sufficient for learning the output function of a symbolic Mealy automaton. $Λ^*_M$ maintains an underapproximation of the essential input characters and refines this set during learning. We prove that $Λ^*_M$ terminates under certain assumptions. Moreover, we provide upper and lower bounds for the query complexity; the closeness of the two bounds suggests that they are tight. We empirically demonstrate that $Λ^*_M$ is i) efficient in the number of queries on practical benchmarks and ii) scalable, as shown by evaluations on randomly generated benchmarks.
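To make the setting concrete, the following is a minimal Python sketch of a symbolic Mealy automaton over the natural numbers, in which each transition is guarded by a half-open interval and emits an output word of zero or more characters. It is an illustration only and not the $Λ^*_M$ algorithm: the interval guards, the names `Transition`, `SymbolicMealy`, and `representatives`, and the choice of interval lower bounds as a finite stand-in for the essential input characters are all assumptions made for this example; the formal definition is the one given in the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    lo: int        # inclusive lower bound of the guard
    hi: float      # exclusive upper bound; math.inf means unbounded
    output: str    # output word, possibly empty or several characters long
    target: str    # name of the successor state


class SymbolicMealy:
    """Hypothetical symbolic Mealy automaton with interval guards."""

    def __init__(self, initial, transitions):
        self.initial = initial
        self.transitions = transitions  # dict: state name -> list of Transition

    def step(self, state, ch):
        # Find the unique guard that contains the input character.
        for t in self.transitions[state]:
            if t.lo <= ch < t.hi:
                return t.output, t.target
        raise ValueError(f"no guard of {state} covers input {ch}")

    def run(self, word):
        # Plays the role of a membership (output) query on an input word.
        state, outputs = self.initial, []
        for ch in word:
            out, state = self.step(state, ch)
            outputs.append(out)
        return outputs


def representatives(machine, state):
    # One character per guard: a finite set that determines the output
    # function of this toy machine (an assumption of this sketch).
    return sorted(t.lo for t in machine.transitions[state])


machine = SymbolicMealy(
    initial="q0",
    transitions={
        "q0": [Transition(0, 10, "a", "q0"), Transition(10, math.inf, "bc", "q1")],
        "q1": [Transition(0, 5, "", "q0"), Transition(5, math.inf, "d", "q1")],
    },
)

print(machine.run([3, 42, 7, 1]))      # ['a', 'bc', 'd', '']
print(representatives(machine, "q0"))  # [0, 10]
```

In this toy machine, querying two characters per state already reveals every output word, although each guard covers many (or infinitely many) characters. A learner does not know these characters in advance, which is why the abstract describes starting from an underapproximation and refining it.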