🤖 AI Summary
Problem: Non-programmers struggle to create secure, efficient script automations for software. Conventional scripting requires programming expertise and API knowledge, while runtime code generation suffers from unverified outputs, security risks, high latency, and substantial computational overhead.
Method: This paper proposes an offline, simulation-driven framework for skill discovery and validation. It treats software scripting interfaces as system-level testbeds for large language models (LLMs), employs a graph neural network (GNN)-based link prediction model of API synergy to identify rarely used yet semantically valid API combinations, and combines top-down functionality guidance with bottom-up API synergy exploration, using offline execution feedback to refine scripts iteratively.
Contribution/Results: Evaluated on Adobe Illustrator, the framework achieves higher automation success rates, shorter response times, and lower runtime token costs than traditional runtime code generation.
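To make the idea of predicting useful but rarely combined APIs concrete, the following is a minimal sketch of link prediction on an API co-occurrence graph. It is not the paper's model: the GNN is replaced by the classical Adamic-Adar heuristic as a plainly simpler stand-in, and the API names, the input format, and the function names are all hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(scripts):
    """Link two APIs whenever they appear together in one script.

    `scripts` is a list of API-name lists (a hypothetical input format).
    """
    graph = defaultdict(set)
    for apis in scripts:
        for a, b in combinations(sorted(set(apis)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def adamic_adar(graph, a, b):
    """Score a pair by its shared neighbors, discounting high-degree hubs.

    Stand-in for a learned GNN link predictor (assumption, not the paper's model).
    """
    score = 0.0
    for shared in graph[a] & graph[b]:
        degree = len(graph[shared])
        if degree > 1:  # a shared neighbor always has degree >= 2
            score += 1.0 / math.log(degree)
    return score


def rank_unseen_pairs(graph, top_k=3):
    """Return the top_k API pairs that never co-occurred, best score first."""
    nodes = sorted(graph)
    candidates = [
        (adamic_adar(graph, a, b), a, b)
        for a, b in combinations(nodes, 2)
        if b not in graph[a]
    ]
    # Sort by descending score, then alphabetically for a stable order.
    candidates.sort(key=lambda item: (-item[0], item[1], item[2]))
    return candidates[:top_k]


# Hypothetical API names, for illustration only.
example_scripts = [
    ["create_path", "set_stroke"],
    ["create_path", "set_fill"],
    ["add_text", "set_fill"],
    ["add_text", "set_font"],
]
example_graph = build_cooccurrence_graph(example_scripts)
top_pairs = rank_unseen_pairs(example_graph)
```

In this toy graph, `add_text` and `create_path` never appear together but share the neighbor `set_fill`, so the pair receives a nonzero score and becomes a candidate for a new task. A learned GNN would replace `adamic_adar` with a model trained on the graph structure and API descriptions.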
📝 Abstract
Scripting interfaces enable users to automate tasks and customize software workflows, but creating scripts traditionally requires programming expertise and familiarity with specific APIs, posing barriers for many users. While Large Language Models (LLMs) can generate code from natural language queries, runtime code generation is severely limited due to unverified code, security risks, longer response times, and higher computational costs. To bridge the gap, we propose an offline simulation framework to curate a software-specific skillset, a collection of verified scripts, by exploiting LLMs and publicly available scripting guides. Our framework comprises two components: (1) task creation, using top-down functionality guidance and bottom-up API synergy exploration to generate helpful tasks; and (2) skill generation with trials, refining and validating scripts based on execution feedback. To efficiently navigate the extensive API landscape, we introduce a Graph Neural Network (GNN)-based link prediction model to capture API synergy, enabling the generation of skills involving underutilized APIs and expanding the skillset's diversity. Experiments with Adobe Illustrator demonstrate that our framework significantly improves automation success rates, reduces response time, and saves runtime token costs compared to traditional runtime code generation. This is the first attempt to use software scripting interfaces as a testbed for LLM-based systems, highlighting the advantages of leveraging execution feedback in a controlled environment and offering valuable insights into aligning AI capabilities with user needs in specialized software domains.
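The second component, skill generation with trials, can be pictured as a bounded retry loop: draft a script, run it offline, and feed any error back into the next draft. The sketch below is an illustration under stated assumptions, not the authors' implementation. The names `generate_skill`, `draft`, `execute`, and `ExecutionResult` are hypothetical, and the LLM and the software sandbox are replaced by toy stand-in functions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ExecutionResult:
    ok: bool
    message: str = ""


def generate_skill(
    task: str,
    draft: Callable[[str, Optional[str]], str],
    execute: Callable[[str], ExecutionResult],
    max_trials: int = 3,
) -> Optional[str]:
    """Return a script that executed successfully, or None if all trials fail."""
    feedback: Optional[str] = None
    for _ in range(max_trials):
        script = draft(task, feedback)  # an LLM call in a real system (assumed)
        result = execute(script)        # offline run inside the target software (assumed)
        if result.ok:
            return script               # verified: safe to store in the skillset
        feedback = result.message       # pass the error to the next draft
    return None


# Toy stand-ins so the sketch runs without an LLM or Adobe Illustrator.
def toy_draft(task: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "undefined_call()"       # first attempt contains a bug
    return f"// {task}\ndraw_circle()"  # revised after seeing the error


def toy_execute(script: str) -> ExecutionResult:
    if "undefined_call" in script:
        return ExecutionResult(False, "ReferenceError: undefined_call is not defined")
    return ExecutionResult(True)


skillset = {}
for task_name in ["draw a circle"]:
    verified = generate_skill(task_name, toy_draft, toy_execute)
    if verified is not None:
        skillset[task_name] = verified
```

Only scripts that pass execution enter `skillset`, which is what lets a runtime system retrieve a verified script instead of generating unverified code per query. Capping the number of trials keeps the offline cost bounded for tasks the model cannot solve.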