🤖 AI Summary
To address the low sample efficiency of DFA learning from demonstrations, this paper proposes a language-augmented few-shot learning method that combines natural-language task descriptions with sparse expert demonstrations. It pioneers the use of large language models (LLMs) as natural-language oracles that answer membership queries in the L* algorithm, bridging demonstration-based learning and formal automaton inference. The approach unifies the L* framework, the semantic reasoning capabilities of LLMs, and an automated mechanism for converting demonstrations into labeled examples. In multi-task experiments, it reduces query counts by over 80% compared to demonstration-only baselines while significantly improving DFA synthesis accuracy. The core contribution is the first LLM-driven, cross-modal DFA inference paradigm, demonstrating empirically that natural language can substantially accelerate formal model learning.
📝 Abstract
Expert demonstrations have proven to be an easy way to indirectly specify complex tasks. Recent algorithms even support extracting unambiguous formal specifications, e.g., deterministic finite automata (DFAs), from demonstrations. Unfortunately, these techniques are generally not sample efficient. In this work, we introduce $L^*LM$, an algorithm for learning DFAs from both demonstrations and natural language. Due to the expressivity of natural language, we observe a significant improvement in the data efficiency of learning DFAs from expert demonstrations. Technically, $L^*LM$ leverages large language models to answer membership queries about the underlying task. This is then combined with recent techniques for transforming learning from demonstrations into a sequence of labeled-example learning problems. In our experiments, we observe that the two modalities complement each other, yielding a powerful few-shot learner.
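The core mechanism described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: demonstrations are first converted into labeled examples, and membership queries from L* that are not already covered by those examples fall back to a natural-language oracle. The helper names (`demos_to_labeled_examples`, `ask_llm`, `MembershipOracle`) are hypothetical, and `ask_llm` is a stub standing in for a real LLM call.

```python
def demos_to_labeled_examples(demos):
    """Convert expert demonstrations into labeled examples.

    A simple assumption for this sketch: each complete demonstrated
    trace is a positive example of the underlying task.
    """
    return {tuple(trace): True for trace in demos}


def ask_llm(word):
    # Hypothetical stand-in for querying an LLM with a natural-language
    # task description, asking "is this trace a valid execution?".
    # Stubbed with an arbitrary rule so the sketch is self-contained.
    return len(word) % 2 == 0


class MembershipOracle:
    """Answers L*'s membership queries, preferring free answers from
    demonstrations and memoizing LLM answers to avoid repeat queries."""

    def __init__(self, demos):
        self.cache = demos_to_labeled_examples(demos)
        self.llm_queries = 0  # tracks how many queries reach the LLM

    def member(self, word):
        word = tuple(word)
        if word in self.cache:      # answered for free by a demonstration
            return self.cache[word]
        self.llm_queries += 1       # otherwise, pay one LLM query
        answer = ask_llm(word)
        self.cache[word] = answer   # memoize the LLM's answer
        return answer
```

In this setup, every query already resolved by a demonstration (or by a previous LLM answer) costs nothing, which is one plausible source of the query-count reduction the paper reports.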