🤖 AI Summary
This paper addresses the problem of learning automata over infinite alphabets, which model structures involving data such as nonces in cryptographic protocols or data values in XML documents. It introduces active learning methods for bar automata, which process finite data words represented as bar strings, that is, words with explicit name-binding letters. The framework allows any learning algorithm for deterministic or nondeterministic finite automata over finite alphabets to be used to learn bar automata, with a query complexity determined by that of the chosen learner. The technical key is the algorithmic handling of α-equivalence of bar strings, which bridges the gap between finite and infinite alphabets. The underlying principles are generic and also apply to bar Büchi automata and bar tree automata, yielding the first active learning methods for data languages of infinite words and finite trees.
📝 Abstract
Automata over infinite alphabets have emerged as a convenient computational model for processing structures involving data, such as nonces in cryptographic protocols or data values in XML documents. We introduce active learning methods for bar automata, a species of automata that process finite data words represented as bar strings, which are words with explicit name binding letters. Bar automata have pleasant algorithmic properties. We develop a framework in which every learning algorithm for standard deterministic or nondeterministic finite automata over finite alphabets can be used to learn bar automata, with a query complexity determined by that of the chosen learner. The technical key to our approach is the algorithmic handling of $\alpha$-equivalence of bar strings, which allows us to bridge the gap between finite and infinite alphabets. The principles underlying our framework are generic and also apply to bar Büchi automata and bar tree automata, leading to the first active learning methods for data languages of infinite words and finite trees.
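To make the notion of $\alpha$-equivalence of bar strings concrete, the following is a minimal Python sketch, not the paper's algorithm. It rests on assumptions that the abstract does not state: a bar string is encoded as a list of strings, a bar letter is written `"|a"` and binds the name `a` for the rest of the word, and a later `"|a"` rebinds it. The function names `normalize` and `alpha_equivalent` are hypothetical. Under these assumptions, two bar strings are $\alpha$-equivalent exactly when they agree after each bound name is replaced by the position of its binder.

```python
def normalize(bar_string):
    """Return a name-independent normal form of a bar string.

    Assumed encoding (illustrative only): "|a" is a bar letter binding
    name "a" to the right; a plain "a" refers to the most recent "|a"
    on its left, or is free if there is none.
    """
    binder_pos = {}  # name -> position of the most recent bar letter binding it
    out = []
    for i, letter in enumerate(bar_string):
        if letter.startswith("|"):
            # Binder: record its position and forget the concrete name.
            binder_pos[letter[1:]] = i
            out.append(("bind",))
        elif letter in binder_pos:
            # Bound occurrence: point at the binder's position.
            out.append(("bound", binder_pos[letter]))
        else:
            # Free occurrence: the concrete name matters.
            out.append(("free", letter))
    return tuple(out)


def alpha_equivalent(u, v):
    """Check alpha-equivalence by comparing normal forms."""
    return normalize(u) == normalize(v)


if __name__ == "__main__":
    # Renaming bound names consistently preserves equivalence.
    print(alpha_equivalent(["|a", "a", "|b", "b"], ["|c", "c", "|c", "c"]))
    # Renaming that would capture a name changes the meaning.
    print(alpha_equivalent(["|a", "|b", "a"], ["|a", "|a", "a"]))
```

The normal form plays the role of a finite, name-free representative of an equivalence class, which is the kind of device that lets a finite-alphabet learner be reused; how the paper actually handles $\alpha$-equivalence inside the learning loop is not specified in the abstract.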