🤖 AI Summary
This paper addresses a fundamental challenge in formal language learning: identifying a relational pattern language L solely from positive examples, that is, strings belonging to L, without negative evidence. We introduce, for the first time, the notion of a *positive characteristic set*: a minimal set of positive examples that uniquely identifies L within a given language class. Methodologically, we construct polynomial-size positive characteristic sets by leveraging formal representations of relational pattern languages and analyzing their identifiability properties, thereby establishing necessary and sufficient conditions for their existence. Our contribution is twofold: (i) it removes the classical requirement for both positive and negative examples, enabling efficient and exact identification of L from positive data alone; and (ii) it systematically develops the theory of positive characteristic sets for relational pattern languages, providing both a theoretical foundation and a constructive framework for language identification from positive examples.
📝 Abstract
In the context of learning formal languages, data about an unknown target language L is given in terms of a set of (word, label) pairs, where a binary label indicates whether or not the given word belongs to L. A (polynomial-size) characteristic set for L, with respect to a reference class ℒ of languages, is a set of such pairs that satisfies certain conditions allowing a learning algorithm to (efficiently) identify L within ℒ. In this paper, we introduce the notion of a positive characteristic set, referring to characteristic sets consisting of only positive examples. These are of importance in the context of learning from positive examples only. We study this notion for classes of relational pattern languages, which are of relevance to various applications in string processing.