🤖 AI Summary
Existing approaches to agentic workflow generation rely on costly iterative optimization in cross-domain scenarios, hindering efficient and stable generalization to novel domains. This work proposes a decompose-recompose-decide mechanism that learns reusable cross-domain workflow capability primitives, so that each task can be mapped to a sparse combination of these primitives for one-shot, task-specific workflow generation. For the first time, composable cross-domain workflow capabilities are internalized within an open-source large language model, augmented with counterfactual attribution to identify the capabilities that matter most. The end-to-end framework achieves superior one-shot performance in multi-domain, cross-domain, and unseen-domain settings, outperforming state-of-the-art methods that require up to 20 iterations, while significantly reducing latency and computational overhead.
📝 Abstract
Automatically generating agentic workflows (executable operator graphs or code that orchestrate reasoning, verification, and repair) has become a practical way to solve complex tasks beyond what single-pass LLM generation can reliably handle. Yet what constitutes a good workflow depends heavily on the task distribution and the available operators. Under domain shift, current systems typically rely on iterative refinement to discover a feasible workflow in a large workflow space, incurring high iteration costs and yielding unstable, domain-specific behavior. In response, we internalize a decompose-recompose-decide mechanism into an open-source LLM for cross-domain workflow generation. To decompose, we learn a compact set of reusable workflow capabilities across diverse domains. To recompose, we map each input task to a sparse composition over these capability bases and generate a task-specific workflow in a single pass. To decide, we attribute the success or failure of workflow generation to the counterfactual contributions of the learned capabilities, capturing which capabilities actually drive success through their marginal effects. Across stringent multi-domain, cross-domain, and unseen-domain evaluations, our single-pass generator surpasses SOTA refinement baselines that use 20 iterations, while substantially reducing generation latency and cost.
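The recompose and decide steps can be illustrated with a toy sketch. This is not the paper's implementation: the capability names, the top-k sparsification rule, the leave-one-out attribution, and the `toy_success` scoring function are all assumptions chosen only to make the two ideas concrete (a sparse composition over a small capability set, and a marginal effect obtained by removing one capability at a time).

```python
from typing import Callable, Dict, FrozenSet, List


def sparse_compose(scores: Dict[str, float], k: int) -> Dict[str, float]:
    """Keep the k highest-scoring capabilities and renormalize their weights.

    Assumes all scores are positive (hypothetical relevance scores).
    """
    top = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(weight for _, weight in top)
    return {name: weight / total for name, weight in top}


def counterfactual_attribution(
    active: List[str],
    success: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Leave-one-out marginal effect of each active capability.

    The effect of a capability is the success score with all active
    capabilities minus the score with that one capability removed.
    """
    full_set = frozenset(active)
    full_score = success(full_set)
    return {cap: full_score - success(full_set - {cap}) for cap in active}


def toy_success(caps: FrozenSet[str]) -> float:
    """Hypothetical success model: verification helps most, planning a little."""
    score = 0.2
    if "verify" in caps:
        score += 0.5
    if "plan" in caps:
        score += 0.1
    return score


# Hypothetical relevance scores of four capabilities for one input task.
task_scores = {"plan": 0.5, "verify": 0.3, "repair": 0.15, "ensemble": 0.05}

# Recompose: a sparse composition that keeps only two capabilities.
weights = sparse_compose(task_scores, k=2)

# Decide: which of the selected capabilities actually drives success?
effects = counterfactual_attribution(list(weights), toy_success)
```

In this toy setting, `plan` receives the larger composition weight, yet removing `verify` lowers the success score the most. That gap between how heavily a capability is used and how much it contributes is what a counterfactual attribution step is meant to expose.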