GraSP: Graph-Structured Skill Compositions for LLM Agents

📅 2026-04-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the performance degradation that large language model agents suffer as their skill repertoire grows, which the authors attribute to the lack of an effective skill orchestration mechanism. They propose GraSP, an architecture that inserts a compilation layer between skill retrieval and execution. GraSP compiles a flat set of skills into a typed directed acyclic graph (DAG) with precondition-effect dependency edges, executes it with node-level verification, and repairs failures through five typed, locality-bounded operators, reducing replanning from O(N) to O(d^h). Presented as the first executable skill graph architecture, GraSP outperforms ReAct, Reflexion, ExpeL, and flat skill baselines on ALFWorld, ScienceWorld, WebShop, and InterCode across eight LLM backbones, with up to a 19-point gain in reward over the strongest baseline and up to 41% fewer environment steps. It also remains robust under skill over-retrieval and degraded skill quality.

📝 Abstract

Skill ecosystems for LLM agents have matured rapidly, yet recent benchmarks show that providing agents with more skills does not monotonically improve performance -- focused sets of 2-3 skills outperform comprehensive documentation, and excessive skills actually hurt. The bottleneck has shifted from skill availability to skill orchestration: agents need not more skills, but a structural mechanism to select, compose, and execute them with explicit causal dependencies. We propose GraSP, the first executable skill graph architecture that introduces a compilation layer between skill retrieval and execution. GraSP transforms flat skill sets into typed directed acyclic graphs (DAGs) with precondition-effect edges, executes them with node-level verification, and performs locality-bounded repair through five typed operators -- reducing replanning from O(N) to O(d^h). Across ALFWorld, ScienceWorld, WebShop, and InterCode with eight LLM backbones, GraSP outperforms ReAct, Reflexion, ExpeL, and flat skill baselines in every configuration, improving reward by up to +19 points over the strongest baseline while cutting environment steps by up to 41%. GraSP's advantage grows with task complexity and is robust to both skill over-retrieval and quality degradation, confirming that structured orchestration -- not larger skill libraries -- is the key to reliable agent execution.
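
The compile, verify, and repair pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Skill` record, the `compile_graph` and `execute` functions, the mug-heating skills, and the single retry-in-place repair step are all hypothetical names and simplifications chosen for illustration. The abstract does not name the five typed repair operators, so only one assumed form of local repair is shown.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Skill:
    """Hypothetical skill record: facts it requires and facts it produces."""
    name: str
    preconditions: frozenset
    effects: frozenset


def compile_graph(skills):
    """Map each skill to the skills whose effects satisfy its preconditions."""
    producers = {}
    for skill in skills:
        for fact in skill.effects:
            producers.setdefault(fact, set()).add(skill.name)
    graph = {}
    for skill in skills:
        deps = set()
        for fact in skill.preconditions:
            deps |= producers.get(fact, set())
        deps.discard(skill.name)  # ignore self-dependencies
        graph[skill.name] = deps
    return graph


def execute(skills, initial_state, run, max_retries=1):
    """Run skills in dependency order, verifying each node's effects."""
    by_name = {s.name: s for s in skills}
    # static_order() raises graphlib.CycleError if the graph is not a DAG.
    order = list(TopologicalSorter(compile_graph(skills)).static_order())
    state = set(initial_state)
    trace = []
    for name in order:
        skill = by_name[name]
        if not skill.preconditions <= state:
            raise RuntimeError(f"unmet preconditions for {name}")
        for attempt in range(max_retries + 1):
            produced = set(run(skill, attempt))
            trace.append((name, attempt))
            if skill.effects <= produced:  # node-level verification
                state |= produced
                break
            # Assumed repair: retry only this node, leaving the rest untouched.
        else:
            raise RuntimeError(f"local repair failed for {name}")
    return state, trace


def flaky_run(skill, attempt):
    """Stand-in for an environment call: pick_mug fails on its first attempt."""
    if skill.name == "pick_mug" and attempt == 0:
        return set()
    return skill.effects


# Illustrative skills, listed out of order on purpose.
SKILLS = [
    Skill("heat_mug", frozenset({"holding_mug"}), frozenset({"mug_hot"})),
    Skill("pick_mug", frozenset({"mug_located"}), frozenset({"holding_mug"})),
    Skill("find_mug", frozenset(), frozenset({"mug_located"})),
]

final_state, trace = execute(SKILLS, set(), flaky_run)
```

In this toy run the failed `pick_mug` node is retried once while `find_mug` is not re-executed, which is the intuition behind bounding repair to the neighborhood of the failure instead of replanning the whole sequence.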

Problem

Research questions and friction points this paper is trying to address.

skill orchestration

LLM agents

structured composition

causal dependencies

executable skill graph

Innovation

Methods, ideas, or system contributions that make the work stand out.

skill graph

structured orchestration

executable DAG