🤖 AI Summary
This work addresses the scarcity of high-quality, diverse execution trajectories, which limits the training of terminal agents; existing task synthesis methods scale the number of tasks but offer little control over trajectory diversity. To overcome this limitation, we propose SkillSynth, a framework built on a scenario-mediated skill graph. Scenarios act as intermediate nodes that connect command-line skills; SkillSynth samples workflow paths from this graph and uses a multi-agent harness to instantiate them into executable terminal tasks, thereby explicitly controlling the diversity of the minimal execution trajectories required to solve them. Experiments on Terminal-Bench demonstrate the method's effectiveness, and tasks synthesized by SkillSynth have been used to train the Hy3 Preview model, contributing to its improved agentic capabilities in terminal environments.
📝 Abstract
Terminal agents have demonstrated strong potential for autonomous command-line execution, yet their training remains constrained by the scarcity of high-quality and diverse execution trajectories. Existing approaches mitigate this bottleneck by synthesizing large-scale terminal task instances for trajectory sampling. However, they primarily focus on scaling the number of tasks while providing limited control over the diversity of execution trajectories that agents actually experience during training. In this paper, we present SkillSynth, an automated framework for terminal task synthesis built on a scenario-mediated skill graph. SkillSynth first constructs a large-scale skill graph, where scenarios serve as intermediate transition nodes that connect diverse command-line skills. It then samples paths from this graph as abstractions of real-world workflows, and uses a multi-agent harness to instantiate them into executable task instances. By grounding task synthesis in graph-sampled workflow paths, SkillSynth explicitly controls the diversity of minimal execution trajectories required to solve the synthesized tasks. Experiments on Terminal-Bench demonstrate the effectiveness of SkillSynth. Moreover, task instances synthesized by SkillSynth have been adopted to train Hy3 Preview, contributing to its enhanced agentic capabilities in terminal-based settings.
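The path-sampling idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy skill and scenario names, the two dictionaries, the `sample_workflow_path` function, and the uniform random walk are hypothetical and are not taken from the SkillSynth implementation.

```python
import random

# Hypothetical toy graph (illustrative names only, not from the paper).
# Each skill lists the scenarios it can lead into; each scenario lists
# the skills that can be applied within it.
SKILL_TO_SCENARIOS = {
    "grep_logs": ["log_triage"],
    "tar_extract": ["archive_inspection"],
    "awk_aggregate": ["log_triage", "report_building"],
    "sed_rewrite": ["config_migration"],
    "curl_download": ["archive_inspection"],
}
SCENARIO_TO_SKILLS = {
    "log_triage": ["grep_logs", "awk_aggregate"],
    "archive_inspection": ["tar_extract", "grep_logs"],
    "report_building": ["awk_aggregate", "sed_rewrite"],
    "config_migration": ["sed_rewrite", "curl_download"],
}


def sample_workflow_path(start_skill, num_skills, rng):
    """Random walk skill -> scenario -> skill -> ..., never reusing a skill.

    Returns a list alternating skills (even indices) and scenarios (odd
    indices). The walk stops early if no unused skill is reachable.
    """
    path = [start_skill]
    used = {start_skill}
    current = start_skill
    while len(used) < num_skills:
        # All (scenario, next_skill) transitions that reach an unused skill.
        options = [
            (scenario, skill)
            for scenario in SKILL_TO_SCENARIOS[current]
            for skill in SCENARIO_TO_SKILLS[scenario]
            if skill not in used
        ]
        if not options:
            break  # dead end: return the shorter path
        scenario, current = rng.choice(options)
        path += [scenario, current]
        used.add(current)
    return path


path = sample_workflow_path("grep_logs", 3, random.Random(0))
print(" -> ".join(path))
```

In this sketch the skills along a sampled path stand in for the minimal sequence of steps a synthesized task would require, so varying the sampled paths varies the required trajectories. Turning a path into an executable task (the multi-agent harness in the abstract) is outside the scope of this example.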